Cosmograph Data Kit
Data Kit utilities transform your data into Cosmograph-ready formats. They handle preparation, generate configurations, and provide statistical insights.
Data Kit functions
Three async functions for different use cases:
| Function | Purpose | Output |
|---|---|---|
| prepareCosmographData() | Recommended. Prepare data as Arrow tables, ready to use immediately | Points/links as CosmographData, config, statistics |
| prepareCosmographDataFiles() | Prepare data as binary files for storage or transfer | Points/links as Blob, config, statistics |
| downloadCosmographData() | Prepare data and auto-download files | Data and JSON configuration downloaded, returns config and statistics |
Cosmograph Data Kit logs the preparation process to the browser console. If something goes wrong, the error message appears there, along with warnings about columns that are missing from the data source and required columns that are not provided in the configuration.
Function arguments
| Parameter | Type | Description |
|---|---|---|
| config | CosmographDataPrepConfig | Configuration for data preparation |
| pointsData | CosmographInputData | Points data to process (Arrow Table, CSV, JSON, Parquet, or URL) |
| linksData | CosmographInputData | Optional links data to process |
Both prepareCosmographDataFiles and downloadCosmographData support specifying the output format (.csv, .arrow, or .parquet) via outputFormat in the config. If not specified, defaults to .parquet.
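
As a minimal sketch of how the output format might be chosen, the helper below returns a config with an explicit outputFormat and falls back to parquet when none is given. The OutputFormat type, the DataPrepConfigSketch interface, and the withOutputFormat helper are hypothetical names introduced for illustration and are not part of the library. Only the outputFormat key, its three values, and the parquet default come from the text above. The sketch assumes the value is written without a leading dot.

```typescript
// Hypothetical local types: they only loosely mirror the documented config shape
// and are NOT imported from @cosmograph/cosmograph.
type OutputFormat = 'csv' | 'arrow' | 'parquet'

interface DataPrepConfigSketch {
  points: Record<string, unknown>
  links?: Record<string, unknown>
  outputFormat?: OutputFormat
}

// Returns a copy of the config with outputFormat always filled in.
// 'parquet' is the documented default when no format is specified.
function withOutputFormat(
  config: DataPrepConfigSketch,
  format?: OutputFormat,
): DataPrepConfigSketch & { outputFormat: OutputFormat } {
  return { ...config, outputFormat: format ?? config.outputFormat ?? 'parquet' }
}

const csvConfig = withOutputFormat({ points: { pointIdBy: 'id' } }, 'csv')
const defaultConfig = withOutputFormat({ points: { pointIdBy: 'id' } })
console.log(csvConfig.outputFormat, defaultConfig.outputFormat)
```

A config built this way would then be passed as the first argument to prepareCosmographDataFiles or downloadCosmographData.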
Return value
All functions return a Promise with:
| Property | Description |
|---|---|
| points* | Prepared points data in specified format |
| links* | Prepared links data (when provided) |
| cosmographConfig | Ready-to-use Cosmograph configuration |
| pointsSummary | Stats: column info, aggregates (count, min, max, unique, avg, std), NULL percentages |
| linksSummary | Stats for links |
| pointsProcessedFully | Indicates whether the points dataset was processed fully without hitting memory limits |
| linksProcessedFully | Indicates whether the links dataset was processed fully |
* Only in prepareCosmographData and prepareCosmographDataFiles. downloadCosmographData downloads files instead.
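
The pointsProcessedFully and linksProcessedFully flags are easy to overlook, so here is a small sketch of how a caller might surface them. PrepResultSketch and collectPrepWarnings are hypothetical names. The interface covers only the properties listed in the table above and assumes both flags are booleans.

```typescript
// Hypothetical shape covering only the flags and summaries named in the table above.
interface PrepResultSketch {
  pointsSummary?: unknown
  linksSummary?: unknown
  pointsProcessedFully?: boolean
  linksProcessedFully?: boolean
}

// Collects a human-readable warning for every dataset that was not processed fully.
// A missing flag is treated as "no information" and produces no warning.
function collectPrepWarnings(result: PrepResultSketch): string[] {
  const warnings: string[] = []
  if (result.pointsProcessedFully === false) {
    warnings.push('Points were only partially processed')
  }
  if (result.linksProcessedFully === false) {
    warnings.push('Links were only partially processed')
  }
  return warnings
}

// Example with a made-up result object:
const warnings = collectPrepWarnings({ pointsProcessedFully: true, linksProcessedFully: false })
warnings.forEach((message) => console.warn(message))
```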
Configuration
Configure data preparation with the CosmographDataPrepConfig interface, which includes the following properties:
| Property | Type | Description |
|---|---|---|
| points | CosmographDataPrepPointsConfig | Configuration for the points table |
| links | CosmographDataPrepLinksConfig | (Optional) Configuration for the links table |
| outputFormat* | string | (Optional) Output format for prepared data: csv, arrow, or parquet. Defaults to parquet |
* outputFormat has no effect when using prepareCosmographData because it prepares data into the CosmographData format.
Points configuration
To prepare your points data for Cosmograph, you need to specify the required and optional properties in the points configuration object.
Required properties
You must provide either:
pointIdBy: The column/field name that uniquely identifies each point in your dataset.
If your dataset doesn't have a candidate for a pointIdBy column and you're not using links, provide pointIdBy: undefined. This automatically generates columns with enumerated point ids and indexes for your data, based on the item count.
OR
linkSourceBy and linkTargetsBy: If you want to generate points from your links data, specify the column/field names containing the source and target identifiers of each link. This option only works if you also provide links data.
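
The either/or rule above can be checked before calling any Data Kit function. The sketch below is one way to do that, using the pointIdBy spelling from the configuration examples. PointsConfigSketch, PointsMode, and detectPointsMode are hypothetical names, and the check reflects only the rules stated in this section, not the library's own validation.

```typescript
// Hypothetical local type, not the library's CosmographDataPrepPointsConfig.
interface PointsConfigSketch {
  pointIdBy?: string | undefined
  linkSourceBy?: string
  linkTargetsBy?: string[]
}

type PointsMode = 'id-column' | 'generated-ids' | 'from-links' | 'invalid'

// Classifies a points config according to the required-property rules above.
function detectPointsMode(points: PointsConfigSketch, hasLinksData: boolean): PointsMode {
  // A named id column always wins.
  if (typeof points.pointIdBy === 'string') return 'id-column'
  const hasLinkFields =
    typeof points.linkSourceBy === 'string' &&
    Array.isArray(points.linkTargetsBy) &&
    points.linkTargetsBy.length > 0
  // Points can only be derived from links when links data is actually provided.
  if (hasLinkFields) return hasLinksData ? 'from-links' : 'invalid'
  // An explicit `pointIdBy: undefined` asks for generated ids (documented for the no-links case).
  if ('pointIdBy' in points && !hasLinksData) return 'generated-ids'
  return 'invalid'
}

console.log(detectPointsMode({ pointIdBy: undefined }, false))
```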
Optional properties
If you use a separate data source for points generation (not link-based), you can also include the following optional properties to enhance your graph:
| Property | Type | Description |
|---|---|---|
| pointColorBy | string | Field with point colors (string or RGBA [r, g, b, a]). Pair with pointColorByFn for custom mappings |
| pointSizeBy | string | Field with numeric values for sizes. Pair with pointSizeByFn for custom mappings |
| pointLabelBy | string | Field with point labels (auto-displayed on graph). Pair with pointLabelFn for custom mappings |
| pointLabelWeightBy | string | Field with float label weights (from 0 to 1). Higher = more visible. Pair with pointLabelWeightFn |
| pointXBy | string | X-coordinate field. Use with pointYBy for fixed positions |
| pointYBy | string | Y-coordinate field. Use with pointXBy for fixed positions |
| pointIncludeColumns | string[] | Additional fields to include in other components, for custom behaviors or styles |
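
Since pointLabelWeightBy expects floats from 0 to 1, raw counts usually need rescaling first. The sketch below shows one possible pre-processing step: min-max scaling a numeric field into a new labelWeight field before handing the rows to Data Kit. The addLabelWeight helper and the labelWeight field name are hypothetical, and min-max scaling is just one reasonable choice, not something the library prescribes.

```typescript
// Adds a `labelWeight` field scaled to [0, 1] from a numeric source field.
// Hypothetical helper: this is plain pre-processing, not a Data Kit function.
function addLabelWeight<T extends Record<string, unknown>>(
  rows: T[],
  sourceField: keyof T & string,
): Array<T & { labelWeight: number }> {
  const values = rows.map((row) => Number(row[sourceField]))
  const min = Math.min(...values)
  const max = Math.max(...values)
  const range = max - min
  return rows.map((row) => {
    const value = Number(row[sourceField])
    // When all values are equal, fall back to a uniform weight of 1.
    return { ...row, labelWeight: range === 0 ? 1 : (value - min) / range }
  })
}

const weighted = addLabelWeight(
  [
    { id: 'a', comments: 10 },
    { id: 'b', comments: 20 },
    { id: 'c', comments: 30 },
  ],
  'comments',
)
console.log(weighted.map((row) => row.labelWeight)) // [0, 0.5, 1]
```

The resulting field could then be referenced in the points configuration as pointLabelWeightBy: 'labelWeight'.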
Links configuration
Required properties:
| Property | Description |
|---|---|
| linkSourceBy | Column/field with link source identifiers |
| linkTargetsBy | Column/field(s) with link target identifiers (array). Will be merged into one target column |
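
To make "merged into one target column" more concrete, the sketch below shows one plausible reading: every row expands into one link per non-empty target column, and the unique ids seen in those links form the point set. This is an assumption made for illustration, and Data Kit's actual merging may differ. Row, LinkSketch, expandTargets, and pointIdsFromLinks are hypothetical names, and the sample column names are made up.

```typescript
type Row = Record<string, string | number | null | undefined>

interface LinkSketch {
  source: string
  target: string
}

// Expands each row into one link per non-empty target column (assumed semantics).
function expandTargets(rows: Row[], sourceBy: string, targetsBy: string[]): LinkSketch[] {
  const links: LinkSketch[] = []
  for (const row of rows) {
    const source = row[sourceBy]
    if (source === null || source === undefined) continue
    for (const column of targetsBy) {
      const target = row[column]
      if (target === null || target === undefined || target === '') continue
      links.push({ source: String(source), target: String(target) })
    }
  }
  return links
}

// Collects the unique ids that appear as a source or a target, in first-seen order.
function pointIdsFromLinks(links: LinkSketch[]): string[] {
  const ids = new Set<string>()
  for (const link of links) {
    ids.add(link.source)
    ids.add(link.target)
  }
  return Array.from(ids)
}

const transactions: Row[] = [
  { sender: 'alice', receiver: 'bob', broker: 'carol' },
  { sender: 'bob', receiver: 'alice', broker: null },
]
const links = expandTargets(transactions, 'sender', ['receiver', 'broker'])
const pointIds = pointIdsFromLinks(links)
console.log(links.length, pointIds)
```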
Optional properties:
| Property | Type | Description |
|---|---|---|
| linkColorBy | string | Field with link colors. Pair with linkColorByFn for custom mappings |
| linkWidthBy | string | Field with numeric values for link widths. Pair with linkWidthByFn |
| linkArrowBy | string | Field with booleans (show arrow?). Pair with linkArrowByFn |
| linkStrengthBy | string | Field with numeric link strength values. Pair with linkStrengthByFn |
| linkIncludeColumns | string[] | Additional fields to include |
CSV-specific properties
For CSV inputs, the additional properties csvParseTimeFormat and csvColumnTypesMap help handle special parsing cases.
These properties only take effect when the source data is in CSV format.
| Property | Description |
|---|---|
| csvParseTimeFormat | Time format for CSV parsing (e.g., 'YYYY-MM-DD') |
| csvColumnTypesMap | Map of column names to data types for CSV parsing when automatic parsing fails (e.g., { id: 'VARCHAR', value: 'FLOAT' }) |
See the usage example under "CSV with custom parsing" in the configuration examples.
Configuration examples
Basic configuration
const config = {
  points: {
    pointIdBy: 'id', // Required: Unique identifier for each point
    pointColorBy: 'color', // Optional: Color of the points
    pointSizeBy: 'value', // Optional: Size of the points
  },
  links: {
    linkSourceBy: 'source', // Required: Source of the link
    linkTargetsBy: ['target'], // Required: Targets of the link
    linkColorBy: 'color', // Optional: Color of the links
    linkWidthBy: 'value', // Optional: Width of the links
  },
}
Generate points and links from links only
You can create a points dataset for Cosmograph even if you only have a single file with transaction data:
const config = {
  points: {
    linkSourceBy: 'source_column', // Column containing the link source
    linkTargetsBy: ['target_column', 'target_column2'], // Columns containing the link targets
  },
  links: {
    linkSourceBy: 'source_column',
    linkTargetsBy: ['target_column', 'target_column2'],
    // ... other link options
  },
};
Automatically generate point identifiers and indexes
Set the pointIdBy property to undefined to automatically generate columns with enumerated point ids and indexes for your data, based on the item count.
const config = {
  points: {
    pointIdBy: undefined,
  },
};
CSV with custom parsing
const config = {
  points: {
    pointIdBy: 'id',
    pointLabelBy: 'id',
    pointSizeBy: 'comments',
    pointIncludeColumns: ['date'],
    csvParseTimeFormat: 'YYYY-MM-DD',
    csvColumnTypesMap: {
      id: 'VARCHAR',
      comments: 'FLOAT',
      date: 'DATE',
    },
  },
}
Functions usage examples
Prepare data with Data Kit functions
This example only covers data preparation. See the next one for preparing data and loading it into Cosmograph.
import { downloadCosmographData, prepareCosmographData, prepareCosmographDataFiles } from '@cosmograph/cosmograph'
// Example data
const pointsData = [
  { id: '1', color: 'red', value: 10 },
  { id: '2', color: 'blue', value: 20 },
]
const linksData = [
  { source: '1', target: '2', color: 'green', value: 5 },
]
// Example configuration
const config = {
  points: {
    pointIdBy: 'id',
    pointColorBy: 'color',
    pointSizeBy: 'value',
    outputFilename: 'custom-points-filename',
  },
  links: {
    linkSourceBy: 'source',
    linkTargetsBy: ['target'],
    linkColorBy: 'color',
    linkWidthBy: 'value',
    outputFilename: 'custom-links-filename',
  },
}
// downloadCosmographData: Prepares data, then downloads the files, named according to `outputFilename` in the configuration
downloadCosmographData(config, pointsData, linksData)
  .then(({ cosmographConfig, pointsSummary, linksSummary }) => {
    console.log('Cosmograph config:', cosmographConfig)
    console.log('Points data summary:', pointsSummary)
    console.log('Links data summary:', linksSummary)
  })
  .catch((error) => {
    console.error('Error:', error)
  })
// prepareCosmographData: Prepares data as Arrow tables
prepareCosmographData(config, pointsData, linksData)
  .then((result) => {
    if (result) {
      const { points, links, cosmographConfig, pointsSummary, linksSummary } = result
      console.log('Arrow points:', points)
      console.log('Arrow links:', links)
      console.log('Cosmograph config:', cosmographConfig)
      console.log('Points data summary:', pointsSummary)
      console.log('Links data summary:', linksSummary)
    }
  })
  .catch((error) => {
    console.error('Error:', error)
  })
// prepareCosmographDataFiles: Prepares data as blobs
prepareCosmographDataFiles(config, pointsData, linksData)
  .then((result) => {
    if (result) {
      const { points, links, cosmographConfig, pointsSummary, linksSummary } = result
      console.log('Blob points:', points)
      console.log('Blob links:', links)
      console.log('Cosmograph config:', cosmographConfig)
      console.log('Points data summary:', pointsSummary)
      console.log('Links data summary:', linksSummary)
    }
  })
  .catch((error) => {
    console.error('Error:', error)
  })
Prepare data and upload it into Cosmograph
Prepare data with configuration and upload it into Cosmograph using prepareCosmographData.
import React, { useState } from 'react'
import { CosmographProvider, Cosmograph } from '@cosmograph/react'
import { prepareCosmographData } from '@cosmograph/cosmograph'
const ReactExample = (): JSX.Element => {
  const [config, setConfig] = useState({
    // you can add some initial Cosmograph configuration here like simulation settings
  })
  const [files, setFiles] = useState<{ pointsFile: File | null, linksFile: File | null }>({ pointsFile: null, linksFile: null })
  const handleFileChange = (type: 'pointsFile' | 'linksFile') => (event: React.ChangeEvent<HTMLInputElement>): void => {
    const file = event.target.files?.[0]
    if (file) {
      // Build the next state outside the state updater: updater functions should stay pure,
      // so the async preparation is started here instead, exactly once per file change
      const updatedFiles = { ...files, [type]: file }
      setFiles(updatedFiles)
      void prepareAndSetConfig(updatedFiles.pointsFile, updatedFiles.linksFile)
    }
  }
  const prepareAndSetConfig = async (pointsFile: File | null, linksFile: File | null): Promise<void> => {
    if (pointsFile) {
      const dataPrepConfig = {
        points: {
          pointIdBy: 'id',
          pointColorBy: 'color',
          pointSizeBy: 'value',
        },
        links: {
          linkSourceBy: 'source',
          linkTargetsBy: ['target'],
          linkColorBy: 'color',
          linkWidthBy: 'value',
        },
      }
      const result = await prepareCosmographData(dataPrepConfig, pointsFile, linksFile)
      if (result) {
        const { points, links, cosmographConfig } = result
        setConfig({ points, links, ...cosmographConfig })
      }
    }
  }
  return (
    <CosmographProvider>
      <Cosmograph {...config} />
      <input type="file" accept=".csv,.arrow,.parquet,.json" onChange={handleFileChange('pointsFile')} />
      <input type="file" accept=".csv,.arrow,.parquet,.json" onChange={handleFileChange('linksFile')} />
    </CosmographProvider>
  )
}
export default ReactExample