CustomSecondaryIndex (sdn-apidoc 2.6.8 API)

Type Parameters:
K - type of the value (or composite values) of the indexed column (or columns) in the main column family. This value (or composite value) will become the row key in the secondary index column family.
C - type of the column name in the secondary index column family (row key in the main column family or composite value when sorting information is included).
D - type of the denormalized data to set as the value in the indexed columns.

All Superinterfaces:

ColumnFamilyHandler

All Known Implementing Classes:

GenericCustomSecondaryIndex
```
public interface CustomSecondaryIndex<K extends Serializable,C extends Serializable & Comparable<C>,D>
extends ColumnFamilyHandler
```
Custom secondary index column family that keeps rows from another column family (the main column family) as columns in wide rows.
Cassandra already provides a native secondary index; however, it behaves like a hashed index, which means it supports only equality queries and not range queries. One advantage, though, is that Cassandra's native secondary indexes already handle updates.
Cassandra's native index is recommended only for attributes with low cardinality, i.e., attributes that have few unique values, such as the color of a product.
When a column family is created to hold a custom secondary index, its row key is typically a value of the indexed column, and its column names are the row keys of the main column family rows that have that value set, plus any attribute included for sorting. The column value holds any denormalized data.
When working with secondary indexes, one of two strategies is typically employed: either the column values (or names, if a composite column name is used) contain row keys pointing to a separate column family that holds the actual data, or the complete (or partial) set of data for each entity is stored in the secondary index itself (denormalization). With the first strategy, which is similar to building an index, you first fetch a set of row keys from an index and then multiget the matching data rows from a separate column family. This approach is appealing to many at first because it is more normalized: it allows for easy updates of entities and doesn't require repeating the same data in multiple column families. However, the second step of the data fetching process, the multiget, is fairly expensive and slow. It requires querying many nodes, and each node must perform many disk seeks to fetch the rows if they aren't well cached. This approach does not scale well with large data sets.
Examples:
```
 column_family_secondary_index {
     "indexed_value_1": {
         id_i: <data provided by the denormalizer>,
         ...
         id_j: <data provided by the denormalizer>,
     }
     "indexed_value_2": {
         id_m: <data provided by the denormalizer>,
         ...
         id_n: <data provided by the denormalizer>,
     }
 }
 
 column_family_persons_by_status {
     "Single": {
         person_id_i: <Name, Last Name>,
         ...
         person_id_j: <Name, Last Name>,
     }
     "Married": {
         person_id_m: <Name, Last Name>,
         ...
         person_id_n: <Name, Last Name>,
     }
 }
 
```
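The two read strategies described earlier can be compared with a small in-memory sketch of the persons-by-status layout. Everything here is an assumption made for illustration: the class `IndexStrategies` and its methods are hypothetical names, plain Java maps stand in for column families, and the per-key lookup only imitates a multiget. None of it is part of this API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory model; not part of the documented API.
class IndexStrategies {
    // Main column family: person id -> full name.
    private final Map<String, String> mainColumnFamily = new TreeMap<>();
    // Index column family: status -> wide row of (person id -> denormalized name).
    private final Map<String, TreeMap<String, String>> personsByStatus = new TreeMap<>();

    // Writes the row and its index column together. A status change would
    // also need the old index column removed; that is omitted here.
    public void put(String personId, String status, String fullName) {
        mainColumnFamily.put(personId, fullName);
        personsByStatus.computeIfAbsent(status, k -> new TreeMap<>()).put(personId, fullName);
    }

    // Strategy 1: take only the row keys from the index, then fetch every
    // matching row from the main column family (the costly second step).
    public List<String> readViaMainColumnFamily(String status) {
        List<String> names = new ArrayList<>();
        TreeMap<String, String> wideRow = personsByStatus.get(status);
        if (wideRow != null) {
            for (String personId : wideRow.keySet()) {
                names.add(mainColumnFamily.get(personId)); // stands in for a multiget
            }
        }
        return names;
    }

    // Strategy 2: the index columns already carry the denormalized data,
    // so a single wide-row read answers the query.
    public List<String> readDenormalized(String status) {
        List<String> names = new ArrayList<>();
        TreeMap<String, String> wideRow = personsByStatus.get(status);
        if (wideRow != null) {
            names.addAll(wideRow.values());
        }
        return names;
    }

    public static void main(String[] args) {
        IndexStrategies store = new IndexStrategies();
        store.put("person_2", "Single", "Bob Stone");
        store.put("person_1", "Single", "Ada Byron");
        store.put("person_3", "Married", "Cy Young");
        System.out.println(store.readViaMainColumnFamily("Single"));
        System.out.println(store.readDenormalized("Single"));
    }
}
```

Both methods return the same names in this sketch; the difference is that the first touches the main column family once per matching row, while the second reads one wide row only.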
If the columns need to be sorted by a different attribute, that attribute is included as part of the column name (composite column name). This does not affect the denormalizer.
Example:
```
 column_family_persons_by_status_sorted_by_last_name_and_name {
     "Single": {
         <last_name_i, name_i, person_id_i>: <Birthdate_i>,
         ...
         <last_name_j, name_j, person_id_j>: <Birthdate_j>,
     }
     "Married": {
         <last_name_m, name_m, person_id_m>: <Birthdate_m>,
         ...
         <last_name_n, name_n, person_id_n>: <Birthdate_n>,
     }
 }
 
```
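The effect of a composite column name on ordering can be sketched as follows. This is an illustration under assumptions, not the library's implementation: `PersonColumn` and `CompositeColumnNameSketch` are hypothetical names, and a `TreeMap` stands in for one wide row whose columns are kept sorted by column name. The composite compares last name first, then name, then person id, so the id only breaks ties and keeps column names unique.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

// Hypothetical sketch; not part of the documented API.
class CompositeColumnNameSketch {

    // Composite column name <last_name, name, person_id>.
    static final class PersonColumn implements Comparable<PersonColumn> {
        private static final Comparator<PersonColumn> ORDER = Comparator
                .comparing((PersonColumn c) -> c.lastName)
                .thenComparing((PersonColumn c) -> c.name)
                .thenComparing((PersonColumn c) -> c.personId);

        final String lastName;
        final String name;
        final String personId;

        PersonColumn(String lastName, String name, String personId) {
            this.lastName = lastName;
            this.name = name;
            this.personId = personId;
        }

        @Override
        public int compareTo(PersonColumn other) {
            return ORDER.compare(this, other);
        }
    }

    // Returns the person ids in the order the columns are stored in the wide row.
    static List<String> idsInColumnOrder(TreeMap<PersonColumn, String> wideRow) {
        List<String> ids = new ArrayList<>();
        for (PersonColumn column : wideRow.keySet()) {
            ids.add(column.personId);
        }
        return ids;
    }

    public static void main(String[] args) {
        // One wide row, e.g. the "Single" row; values are the denormalized birthdates.
        TreeMap<PersonColumn, String> single = new TreeMap<>();
        single.put(new PersonColumn("Smith", "Zoe", "person_3"), "1990-01-01");
        single.put(new PersonColumn("Adams", "Bob", "person_7"), "1985-05-05");
        single.put(new PersonColumn("Smith", "Al", "person_9"), "1970-03-03");
        System.out.println(idsInColumnOrder(single));
    }
}
```

Because the sort attributes live in the column name, the column value (the denormalized data) is unchanged by the choice of ordering.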
Author:

Fabiel Zuniga

Method Summary

Methods
Modifier and Type	Method and Description
`void`	`clear(DataStoreContext context)` Updates the index after deleting all rows from the main column family.
`long`	`count(K indexKey, DataStoreContext context)` Counts the number of columns in the index.
`void`	`delete(C indexEntry, K indexKey, DataStoreContext context)` Updates the index after a row has been deleted from the main column family.
`void`	`delete(K indexKey, DataStoreContext context)` Deletes a row in the secondary index column family.
`void`	`insert(C indexEntry, D denormalizedData, K indexKey, DataStoreContext context)` Updates the index after a row has been inserted into the main column family.
`List<Column<C,D>>`	`read(K indexKey, DataStoreContext context)` Reads the index entries.
`List<Column<C,D>>`	`read(List<C> indexEntries, K indexKey, DataStoreContext context)` Reads the index entries.

Methods inherited from interface com.hp.util.persistence.cassandra.ColumnFamilyHandler
getColumnFamilyDefinitions

- Method Detail
  - insert
```
void insert(C indexEntry,
          D denormalizedData,
          K indexKey,
          DataStoreContext context)
```
    Updates the index after a row has been inserted into the main column family.
    
    Parameters:
    indexEntry - row key in the main column family (or composite value when sorting information is included). Such row key (or composite value) will become the name of the column in the secondary index.
    denormalizedData - denormalized data to include as part of the indexed columns. null if no denormalization is used.
    indexKey - value (or composite values) of the indexed column (or columns) in the main column family. This value (or composite value) will become the row key in the secondary index column family.
    context - data store context.
  - delete
```
void delete(C indexEntry,
          K indexKey,
          DataStoreContext context)
```
    Updates the index after a row has been deleted from the main column family.
    
    Parameters:
    indexEntry - key of the row removed from the main column family.
    indexKey - value of the indexed column in the main column family.
    context - data store context.
  - delete
```
void delete(K indexKey,
          DataStoreContext context)
```
    Deletes a row in the secondary index column family.
    This method should be the preferred way to clear a secondary index if it contains a small, well-known set of rows, because truncating a column family is an expensive operation.
    
    Parameters:
    indexKey - value of the indexed column in the main column family.
    context - data store context.
  - clear
```
void clear(DataStoreContext context)
```
    Updates the index after deleting all rows from the main column family.
    
    Parameters:
    context - data store context.
  - count
```
long count(K indexKey,
         DataStoreContext context)
```
    Counts the number of columns in the index.
    
    Parameters:
    indexKey - value of the indexed column in the main column family or row key in the secondary index column family.
    context - data store context.
    
    Returns:
    the number of entries (columns) in the index row identified by indexKey.
  - read
```
List<Column<C,D>> read(K indexKey,
                     DataStoreContext context)
```
    Reads the index entries.
    
    Parameters:
    indexKey - value of the indexed column in the main column family or row key in the secondary index column family.
    context - data store context.
    
    Returns:
    the list of indexed columns with the denormalized data.
  - read
```
List<Column<C,D>> read(List<C> indexEntries,
                     K indexKey,
                     DataStoreContext context)
```
    Reads the index entries.
    An index is normally used to get rows (from the main column family) that match a specific indexed value, not to load entries already known to match that value, as this method does. This method is defined so that secondary indexes can be used by a SecondaryIndexIntegrator.SecondaryIndexReader.
    
    Parameters:
    indexEntries - index entries to read.
    indexKey - value of the indexed column in the main column family or row key in the secondary index column family.
    context - data store context.
    
    Returns:
    the list of indexed columns with the denormalized data. Any index entry from indexEntries that doesn't exist in the index given by indexKey is not included in the result.
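
The method contracts above can be illustrated with a simplified in-memory analog. This is a sketch under stated assumptions, not GenericCustomSecondaryIndex and not how the library talks to Cassandra: the class `InMemorySecondaryIndex` is hypothetical, the `DataStoreContext` parameter is omitted, and `Map.Entry<C, D>` stands in for `Column<C,D>`.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory analog of the interface; not part of the documented API.
class InMemorySecondaryIndex<K, C extends Comparable<C>, D> {
    // One wide row per index key; columns stay sorted by column name.
    private final Map<K, TreeMap<C, D>> rows = new HashMap<>();

    // Call after a row has been inserted into the main column family.
    public void insert(C indexEntry, D denormalizedData, K indexKey) {
        rows.computeIfAbsent(indexKey, k -> new TreeMap<>()).put(indexEntry, denormalizedData);
    }

    // Call after a row has been deleted from the main column family.
    public void delete(C indexEntry, K indexKey) {
        TreeMap<C, D> row = rows.get(indexKey);
        if (row != null) {
            row.remove(indexEntry);
        }
    }

    // Deletes one whole row of the index.
    public void delete(K indexKey) {
        rows.remove(indexKey);
    }

    // Call after all rows have been deleted from the main column family.
    public void clear() {
        rows.clear();
    }

    public long count(K indexKey) {
        TreeMap<C, D> row = rows.get(indexKey);
        return row == null ? 0L : row.size();
    }

    public List<Map.Entry<C, D>> read(K indexKey) {
        List<Map.Entry<C, D>> result = new ArrayList<>();
        TreeMap<C, D> row = rows.get(indexKey);
        if (row != null) {
            for (Map.Entry<C, D> e : row.entrySet()) {
                result.add(new AbstractMap.SimpleImmutableEntry<>(e.getKey(), e.getValue()));
            }
        }
        return result;
    }

    // Entries that do not exist under indexKey are left out of the result.
    public List<Map.Entry<C, D>> read(List<C> indexEntries, K indexKey) {
        List<Map.Entry<C, D>> result = new ArrayList<>();
        TreeMap<C, D> row = rows.get(indexKey);
        if (row != null) {
            for (C entry : indexEntries) {
                if (row.containsKey(entry)) {
                    result.add(new AbstractMap.SimpleImmutableEntry<>(entry, row.get(entry)));
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        InMemorySecondaryIndex<String, String, String> byStatus = new InMemorySecondaryIndex<>();
        byStatus.insert("person_1", "Ada Byron", "Single");
        byStatus.insert("person_2", "Bob Stone", "Single");
        System.out.println(byStatus.count("Single"));
        System.out.println(byStatus.read("Single"));
    }
}
```

In this sketch the caller keeps the index in step with the main column family by calling `insert` and `delete` after each corresponding change, which mirrors the "updates the index after" wording of the method descriptions.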
