Skills Found
reading-dss-boundary-data
Reads HEC-DSS files (V6/V7) to extract boundary condition data for hydraulic modeling. Automates catalog reading and time series extraction, converting DSS data to pandas DataFrames with metadata. Requires a Java runtime (JVM) and pyjnius, and uses lazy loading to defer that overhead until the first DSS operation.
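The lazy-loading idea mentioned above can be sketched in plain Python. This is only an illustration of the pattern, not the skill's actual code: `LazyBackend` and `fake_jvm_loader` are hypothetical names, and no real JVM or pyjnius call is made.

```python
import threading


class LazyBackend:
    """Defers an expensive setup step until the first real operation."""

    def __init__(self, loader):
        self._loader = loader        # hypothetical callable that does the costly setup
        self._backend = None
        self._lock = threading.Lock()
        self.load_count = 0          # how many times the loader actually ran

    def get(self):
        # Double-checked locking: only the first caller pays the setup cost.
        if self._backend is None:
            with self._lock:
                if self._backend is None:
                    self._backend = self._loader()
                    self.load_count += 1
        return self._backend


def fake_jvm_loader():
    # Stand-in for starting the JVM through pyjnius; nothing is started here.
    return {"started": True}


# Constructing the wrapper is cheap; the loader runs on the first get() only.
backend = LazyBackend(fake_jvm_loader)
```

Importing or constructing the wrapper costs almost nothing, so code that never touches a DSS file never pays for the runtime startup.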
gtars
A Rust-based toolkit for genomic interval analysis with Python bindings. Handles BED files, overlap detection, coverage tracks, and tokenization for ML. Provides CLI tools and APIs for computational genomics workflows, including fragment processing and reference sequence management.
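As a rough illustration of what overlap detection on BED-style intervals means, here is a naive pure-Python sketch. It does not use gtars or reflect its API or algorithms; `overlaps` and `find_overlaps` are hypothetical helpers, and the only assumption is the standard BED convention of 0-based, half-open coordinates.

```python
def overlaps(a, b):
    """Return True if two (chrom, start, end) intervals share at least one base.

    Coordinates are 0-based and half-open, as in BED, so intervals that
    merely touch (a.end == b.start) do not overlap.
    """
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def find_overlaps(queries, targets):
    """Naive all-pairs scan; fine for small inputs, quadratic in general."""
    return [(q, t) for q in queries for t in targets if overlaps(q, t)]


queries = [("chr1", 0, 10), ("chr2", 5, 15)]
targets = [("chr1", 9, 20), ("chr1", 10, 20), ("chr2", 15, 30)]
hits = find_overlaps(queries, targets)
```

A dedicated toolkit would typically replace the all-pairs scan with an indexed structure; the predicate itself stays the same.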
networkx
This skill provides access to the NetworkX Python library for creating, analyzing, and visualizing network graphs. It includes examples for building graphs, running algorithms like shortest paths and centrality measures, generating synthetic networks, and creating visualizations with matplotlib and interactive libraries.
datacommons-client
Provides a Python client for accessing Data Commons, a unified platform for public statistical data from sources like census bureaus and health organizations. It enables querying population, economic, and environmental data, resolving entity IDs, and exploring the underlying knowledge graph. The documentation includes clear examples for common workflows like fetching time-series data and batch processing.
seaborn
This skill provides guidance for using Seaborn to create statistical visualizations in Python. It covers both the function and object interfaces, explains when to use different plot types, and includes practical patterns for EDA and publication figures. The documentation also covers common troubleshooting scenarios and customization through matplotlib.
exploratory-data-analysis
This skill automates exploratory data analysis for over 200 scientific file formats across chemistry, bioinformatics, microscopy, and other domains. It detects file types, extracts format-specific metadata, assesses data quality, and generates detailed markdown reports with analysis recommendations.
xlsx
This skill wraps the xlsx CLI tool to manipulate Excel files without Python or Node.js. It provides SQL-like filtering, cell editing, CSV conversion, and basic analysis. The documentation includes concrete examples for viewing data, searching patterns, updating cells, and common workflows like data extraction.
mongodb-usage
This skill provides MongoDB best practices documentation for querying and schema design. It covers embedding versus referencing decisions, index strategies based on the ESR (Equality, Sort, Range) rule, aggregation pipeline optimization, and connection management. The skill is read-only: it focuses on performance patterns and does not execute queries.
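The ESR rule says that, in a compound index, fields matched by equality come first, then fields used for sorting, then fields used in range filters. A minimal sketch of that ordering follows; `esr_index_order` is a hypothetical helper, not part of the skill or of any MongoDB driver, and the field names are invented for illustration.

```python
ASCENDING = 1
DESCENDING = -1


def esr_index_order(equality, sort, range_fields):
    """Order compound-index keys as Equality, then Sort, then Range.

    equality:     field names matched exactly in the query
    sort:         (field, direction) pairs from the sort clause
    range_fields: field names used with range operators such as $gt or $lt
    """
    keys = []
    seen = set()
    for field in equality:
        if field not in seen:
            keys.append((field, ASCENDING))
            seen.add(field)
    for field, direction in sort:
        if field not in seen:
            keys.append((field, direction))
            seen.add(field)
    for field in range_fields:
        if field not in seen:
            keys.append((field, ASCENDING))
            seen.add(field)
    return keys


# Hypothetical query: status == "active", price > 100, newest first.
index_keys = esr_index_order(
    equality=["status"],
    sort=[("created_at", DESCENDING)],
    range_fields=["price"],
)
```

The result is a list of `(field, direction)` pairs, the same shape that drivers such as PyMongo accept when defining a compound index.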
xsv
This skill provides detailed guidance for using xsv, a fast Rust-based CSV toolkit. It covers 20+ commands for selection, filtering, statistics, joining, and sorting operations. The documentation includes practical examples, performance tips, and integration patterns with other Unix tools.
dask
Provides Dask integration for scaling pandas/NumPy workflows beyond memory limits. Includes DataFrames for tabular data, Arrays for numeric operations, and Bags for unstructured data. Covers scheduler selection, chunk optimization, and common patterns like ETL pipelines and iterative algorithms.
pydicom
Provides guidance for using pydicom to read, write, and manipulate DICOM medical imaging files. Covers pixel data extraction, metadata handling, format conversion, compression, and anonymization. Includes installation instructions, code examples for common workflows, and troubleshooting tips for compression issues.
agentdb-optimization
Provides concrete optimization techniques for AgentDB vector databases: quantization methods (binary, scalar, product) with a reported 4-32x memory reduction, HNSW indexing with a reported 150x faster search, caching strategies, and batch operations. Includes specific performance benchmarks and configuration examples.
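The 4-32x range is consistent with simple storage arithmetic if the baseline is 32-bit floats, which the listing does not state and is assumed here. The sketch below is not AgentDB code: the 768-dimension size and the helper names are illustrative, and product quantization is left out because its ratio depends on codebook settings.

```python
def bytes_per_vector(dim, bits_per_component):
    """Storage for one vector at a given precision per component."""
    return dim * bits_per_component / 8


def binary_quantize(vector):
    """Keep only the sign of each component: 1 bit instead of 32."""
    return [1 if x > 0 else 0 for x in vector]


dim = 768                              # assumed embedding size
full = bytes_per_vector(dim, 32)       # float32 baseline (assumption)
scalar = bytes_per_vector(dim, 8)      # scalar quantization to 8-bit integers
binary = bytes_per_vector(dim, 1)      # binary quantization to single bits

scalar_ratio = full / scalar           # lower end of the range
binary_ratio = full / binary           # upper end of the range
bits = binary_quantize([0.3, -1.2, 0.0, 2.5])
```

Under these assumptions, scalar quantization accounts for the lower end of the range and binary quantization for the upper end.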