Cosmograph v1 to v2 Migration Guide
Cosmograph now uses the Apache Arrow format for efficient data storage and DuckDB-Wasm for fast data operations. These changes have significantly boosted performance and reduced the memory footprint, allowing Cosmograph to render tens of millions of points.
However, this update requires specific configuration for points and links data. Below you will find a description and examples of the basic configuration options Cosmograph now requires.
Data and configuration
As in v1, Cosmograph needs at least points to render. If links data is provided and valid, the links will be rendered as well.
In this section, we'll focus on the minimal required configuration options for both points and links. For a comprehensive list of all available properties, refer to the CosmographConfig documentation.
Input data formats
Cosmograph v2 significantly expands data format support to streamline your workflow. You can now work with CSV, JSON, Apache Parquet, Apache Arrow files, or even direct URLs pointing to your data. Whether you're working with a simple CSV file, binary Arrow data, or connecting to an external DuckDB instance, Cosmograph handles the data ingestion seamlessly.
For a complete overview of supported formats and usage examples, see our data formats guide.
Points configuration
The minimal configuration options required to render points data are:
points: The points data.
pointIdBy: Unique identifier column for each point.
pointIndexBy: Ordinal index column of each point, running from 0 to n - 1, where n is the number of unique points. This index is used for efficient lookup and referencing.
You can find the full list of points properties here.
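If your points do not yet carry an index column, it can be generated before the data is handed to Cosmograph. The following is a minimal sketch, assuming the points are held in memory as an array of plain objects with an `id` field; the helper `addPointIndex` and the column name `idx` are hypothetical names chosen for illustration and are not part of the Cosmograph API.

```typescript
// Hypothetical helper (not a Cosmograph function): adds a sequential `idx`
// field to every point, in input order, starting at 0.
// Throws if an id repeats, because the id column must be unique.
function addPointIndex<T extends { id: string }>(points: T[]): Array<T & { idx: number }> {
  const seen = new Set<string>()
  return points.map((point, idx) => {
    if (seen.has(point.id)) {
      throw new Error(`Duplicate point id: ${point.id}`)
    }
    seen.add(point.id)
    return { ...point, idx }
  })
}

// Example input; the ids are made up for this sketch.
const indexedPoints = addPointIndex([{ id: 'a' }, { id: 'b' }, { id: 'c' }])

// Configuration fragment whose column names match the fields produced above.
const pointsConfig = {
  pointIdBy: 'id',
  pointIndexBy: 'idx',
}

console.log(indexedPoints, pointsConfig)
```

Indexing in input order keeps the result deterministic, so the same input always produces the same `idx` values for links to refer to.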
Links configuration
The minimal required configuration options for links data are:
links: The links data.
linkSourceBy: Unique identifier column that contains the pointIdBy of the source point of the link.
linkSourceIndexBy: The index column of the source point of the link. This corresponds to the pointIndexBy of the point identified by linkSourceBy.
linkTargetBy: Unique identifier column that contains the pointIdBy of the target point of the link.
linkTargetIndexBy: The index column of the target point of the link. This corresponds to the pointIndexBy of the point identified by linkTargetBy.
You can find the full list of link properties here.
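The relationship between the link index columns and the point index can be illustrated with a small lookup. This is a sketch under assumptions: points and links are in-memory arrays of plain objects, and the names `PointRef`, `addLinkIndices`, `sourceidx`, and `targetidx` are illustrative rather than anything Cosmograph defines. Links that reference an unknown point id are dropped here; you may prefer to throw instead.

```typescript
// Illustrative shapes; only the field names used below are assumed.
interface PointRef { id: string; idx: number }
interface RawLink { source: string; target: string }
interface IndexedLink extends RawLink { sourceidx: number; targetidx: number }

// Hypothetical helper (not a Cosmograph function): resolves each link's
// source and target ids to the index of the matching point.
function addLinkIndices(points: PointRef[], links: RawLink[]): IndexedLink[] {
  const idxById = new Map<string, number>()
  for (const point of points) {
    idxById.set(point.id, point.idx)
  }

  const result: IndexedLink[] = []
  for (const link of links) {
    const sourceidx = idxById.get(link.source)
    const targetidx = idxById.get(link.target)
    // Skip links whose endpoints are not present in the points data.
    if (sourceidx === undefined || targetidx === undefined) continue
    result.push({ ...link, sourceidx, targetidx })
  }
  return result
}

const indexedLinks = addLinkIndices(
  [{ id: 'a', idx: 0 }, { id: 'b', idx: 1 }],
  [{ source: 'a', target: 'b' }, { source: 'b', target: 'missing' }],
)

console.log(indexedLinks)
```

With data shaped like this, linkSourceBy, linkSourceIndexBy, linkTargetBy, and linkTargetIndexBy would be set to 'source', 'sourceidx', 'target', and 'targetidx' respectively.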
Limitations
It's important to note that if the required indices (pointIndexBy, linkSourceIndexBy, and linkTargetIndexBy) are not provided, Cosmograph won't be able to render your data. Additionally, if you have multiple targets for each link, you'll need to adjust your data to fit the new format.
The key difference from v1 is the introduction of the pointIndexBy, linkSourceIndexBy, and linkTargetIndexBy properties, which optimize how the source and target points of a link are referenced. Instead of comparing unique identifiers, Cosmograph now uses point indices for faster lookups and comparisons. This lets Cosmograph handle larger datasets with better performance and lower memory overhead, resulting in a more responsive visualization. That's why v2 requires indices in the input data to work properly.
Another limitation is that in v2 you can provide only a single target for each link, using the linkTargetBy property. If your links have multiple targets, you'll need to modify your links data so that all targets are in one column.
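One way to read the single-target requirement is that every link row holds exactly one source and one target. The sketch below assumes a hypothetical input shape in which each record lists several targets in a `targets` array, and expands it into one row per source and target pair; the type and function names are illustrative only, and your actual v1 data may be shaped differently.

```typescript
// Assumed input shape for this sketch: one record with many targets.
interface MultiTargetLink { source: string; targets: string[] }
// Output shape: one row per link, with a single target column.
interface SingleTargetLink { source: string; target: string }

// Hypothetical helper (not a Cosmograph function): emits one output row
// for every (source, target) pair found in the input records.
function expandTargets(links: MultiTargetLink[]): SingleTargetLink[] {
  const rows: SingleTargetLink[] = []
  for (const link of links) {
    for (const target of link.targets) {
      rows.push({ source: link.source, target })
    }
  }
  return rows
}

const singleTargetLinks = expandTargets([
  { source: 'a', targets: ['b', 'c'] },
  { source: 'b', targets: [] }, // a record with no targets yields no rows
])

console.log(singleTargetLinks)
```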
However, there's good news! We've created a tool that will help you easily handle these data preparation tasks.
Cosmograph Data Kit
Cosmograph Data Kit is a set of helper functions that prepare data for Cosmograph v2. It simplifies the migration process and helps avoid confusion in data configuration for Cosmograph v2.
These functions convert your data into formats that Cosmograph recognizes and generate all necessary indices if your data doesn't have them. Old data formats are still supported, but you'll need to process them through the Cosmograph Data Kit. The functions also generate a ready-to-use configuration for Cosmograph, tailored specifically to your data.
Learn how to use it here.
Upload data into Cosmograph
Below are ready-to-use examples demonstrating how to upload data into Cosmograph with basic configuration.
Note that the pointColorBy, pointSizeBy, linkColorBy, and linkWidthBy properties are optional. They are included in these examples for demonstration purposes only and are not required in the data or configuration.
import React, { useState } from 'react'
import { Cosmograph } from '@cosmograph/react'

const ReactCosmographExample = () => {
  // Files selected by the user; both are undefined until a file is chosen.
  const [data, setData] = useState<{ points?: File; links?: File }>()

  // Column names in the uploaded files that Cosmograph should read.
  const [config] = useState({
    pointIdBy: 'id',
    pointIndexBy: 'idx',
    pointColorBy: 'color', // optional
    pointSizeBy: 'value', // optional
    linkSourceBy: 'source',
    linkSourceIndexBy: 'sourceidx',
    linkTargetBy: 'target',
    linkTargetIndexBy: 'targetidx',
    linkColorBy: 'color', // optional
    linkWidthBy: 'value', // optional
  })

  // Store the selected points file, keeping any links file already chosen.
  const handlePointsFileChange = (event: React.ChangeEvent<HTMLInputElement>): void => {
    const file = event.target.files?.[0]
    if (file) {
      setData((prevData) => ({ ...prevData, points: file }))
    }
  }

  // Store the selected links file, keeping any points file already chosen.
  const handleLinksFileChange = (event: React.ChangeEvent<HTMLInputElement>): void => {
    const file = event.target.files?.[0]
    if (file) {
      setData((prevData) => ({ ...prevData, links: file }))
    }
  }

  return (
    <div>
      <Cosmograph {...config} points={data?.points} links={data?.links} />
      <input type="file" onChange={handlePointsFileChange} />
      <input type="file" onChange={handleLinksFileChange} />
    </div>
  )
}