Tutorial

This tutorial walks you through your first conda download statistics query. By the end, you will know how to use both the command-line tool and the Python API.

Note

Install condastats before you begin (see Installation). Verify it works by running condastats --version.

Step 1: Get total downloads for a package

The simplest query returns the total download count for a package since 2017:

$ condastats overall pandas
>>> from condastats import overall
>>> overall("pandas")
pkg_name
pandas    24086379
Name: counts, dtype: int64

This sums all downloads across every month, platform, and channel in the Anaconda public dataset.

Step 2: Focus on a specific month

Add --month to restrict the query to a single month:

$ condastats overall pandas --month 2024-01
>>> overall("pandas", month="2024-01")
pkg_name
pandas    932443
Name: counts, dtype: int64
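
The month values in this tutorial are written as YYYY-MM strings. If you start from date objects, a small helper can produce strings in that shape. This is a minimal sketch using only the standard library; month_string and month_range are hypothetical helper names, not part of condastats.

```python
from datetime import date


def month_string(d):
    """Format a date as 'YYYY-MM', the month format used in this tutorial."""
    return d.strftime("%Y-%m")


def month_range(start, end):
    """List every month from start to end (inclusive) as 'YYYY-MM' strings."""
    months = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        months.append(f"{year:04d}-{month:02d}")
        month += 1
        if month > 12:
            # Roll over into January of the next year.
            year, month = year + 1, 1
    return months


print(month_string(date(2024, 1, 15)))
print(month_range(date(2023, 11, 1), date(2024, 2, 1)))
```

The resulting strings can be passed wherever this tutorial uses a literal such as "2024-01".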

Step 3: See the monthly trend

Provide a range with --start_month and --end_month, then pass --monthly to get the per-month breakdown:

$ condastats overall pandas --start_month 2024-01 --end_month 2024-03 --monthly
>>> overall("pandas", start_month="2024-01", end_month="2024-03", monthly=True)
pkg_name  time
pandas    2024-01     932443.0
          2024-02    1049595.0
          2024-03    1268802.0
Name: counts, dtype: float64
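
Once you have a monthly series, ordinary pandas methods can summarize the trend. The sketch below rebuilds the result shown above by hand so that it runs offline; the string type of the time index level is an assumption, and a real query result may use a different type for that level.

```python
import pandas as pd

# Rebuild the Step 3 result by hand (values copied from the output above).
# Assumption: the "time" level is treated as plain strings here.
index = pd.MultiIndex.from_tuples(
    [("pandas", "2024-01"), ("pandas", "2024-02"), ("pandas", "2024-03")],
    names=["pkg_name", "time"],
)
monthly = pd.Series([932443.0, 1049595.0, 1268802.0], index=index, name="counts")

# Total downloads across the three-month range.
total = int(monthly.sum())

# Month-over-month growth in percent; the first month has no
# predecessor, so its value is NaN.
growth = (monthly.pct_change() * 100).round(1)

print(total)
print(growth)
```

With a real query you would replace the hand-built series with the return value of the overall(...) call shown above.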

Step 4: Break down by a dimension

Use a different subcommand (pkg_platform, data_source, pkg_version, or pkg_python) to group downloads by platform, data source, package version, or Python version:

$ condastats pkg_platform pandas --month 2024-01
>>> from condastats import pkg_platform
>>> pkg_platform("pandas", month="2024-01")
pkg_name  pkg_platform
pandas    linux-32              22.0
          linux-64          461318.0
          linux-aarch64      19375.0
          osx-64            131849.0
          osx-arm64          68324.0
          win-32               155.0
          win-64            251400.0
Name: counts, dtype: float64
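
The seven platform counts above add up to 932443, the same total that Step 2 reports for January 2024. A breakdown like this converts easily into percentage shares. The sketch below copies the values from the output above into a plain Series (dropping the pkg_name index level for brevity, which is a simplification and not what the query returns).

```python
import pandas as pd

# Platform counts for pandas in 2024-01, copied from the output above.
platform_counts = pd.Series(
    {
        "linux-32": 22.0,
        "linux-64": 461318.0,
        "linux-aarch64": 19375.0,
        "osx-64": 131849.0,
        "osx-arm64": 68324.0,
        "win-32": 155.0,
        "win-64": 251400.0,
    },
    name="counts",
)

# The breakdown should sum to the overall monthly total from Step 2.
total = platform_counts.sum()

# Share of each platform in percent, largest first.
shares = (platform_counts / total * 100).round(2).sort_values(ascending=False)

print(int(total))
print(shares)
```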

Step 5: Compare multiple packages

Both the CLI and the Python API accept multiple package names:

$ condastats overall pandas numpy dask --month 2024-01
>>> overall(["pandas", "numpy", "dask"], month="2024-01")
pkg_name
dask      221200
numpy    3345821
pandas    932443
Name: counts, dtype: int64

Tip

Every function returns a pandas.Series (or pandas.DataFrame with complete=True), so you can immediately chain pandas operations on the result.
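
As an illustration of such chaining, the sketch below takes the Step 5 result (rebuilt by hand so the example runs without network access) and ranks the packages, converts the result to a DataFrame, and expresses each count relative to pandas. Only standard pandas methods are used; the variable names are illustrative.

```python
import pandas as pd

# Step 5 result, rebuilt by hand from the output above.
counts = pd.Series(
    {"dask": 221200, "numpy": 3345821, "pandas": 932443},
    name="counts",
)
counts.index.name = "pkg_name"

# Rank packages from most to least downloaded.
ranked = counts.sort_values(ascending=False)

# Turn the Series into a two-column DataFrame (pkg_name, counts).
table = ranked.reset_index()

# Express every count as a multiple of the pandas count.
relative = (counts / counts["pandas"]).round(2)

print(ranked)
print(table)
print(relative)
```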

What next?

How-to guides

Practical recipes for filtering, grouping, Jupyter, and more.

API reference (Python API)

Full documentation for every function and parameter.

CLI reference (Command-line interface)

All subcommands, options, and exit codes.

Explanation

How the data source and query pipeline work.