Tutorial#
This tutorial walks you through your first conda download statistics query. By the end, you will know how to use both the command-line tool and the Python API.
Note
Install condastats before you begin (see Installation). Verify that it works by running condastats --version.
Step 1: Get total downloads for a package#
The simplest query returns the total download count for a package since 2017:
$ condastats overall pandas
>>> from condastats import overall
>>> overall("pandas")
pkg_name
pandas 24086379
Name: counts, dtype: int64
This sums all downloads across every month, platform, and data source (channel) in the Anaconda public dataset.
Step 2: Focus on a specific month#
Add --month (or the month= keyword in Python) to restrict the query to a single month, written as YYYY-MM:
$ condastats overall pandas --month 2024-01
>>> overall("pandas", month="2024-01")
pkg_name
pandas 932443
Name: counts, dtype: int64
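If you want to query the most recent full month without typing it by hand, the YYYY-MM string can be built from a date. The sketch below uses only the standard library. The helper name previous_month is hypothetical and not part of condastats, and whether the dataset already contains that month is an assumption you would need to check.

```python
from datetime import date


def previous_month(today: date) -> str:
    """Return the calendar month before `today` as a "YYYY-MM" string."""
    if today.month == 1:
        # January rolls back to December of the previous year.
        year, month = today.year - 1, 12
    else:
        year, month = today.year, today.month - 1
    return f"{year:04d}-{month:02d}"


# Fixed date so the example is reproducible; use date.today() in practice.
month = previous_month(date(2024, 2, 15))
print(month)

# With condastats installed and network access available, the string can be
# passed straight to the query shown above:
# overall("pandas", month=month)
```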
Step 3: See the monthly trend#
Provide a range with --start_month and --end_month, then add --monthly to get a per-month breakdown instead of a single total:
$ condastats overall pandas --start_month 2024-01 --end_month 2024-03 --monthly
>>> overall("pandas", start_month="2024-01", end_month="2024-03", monthly=True)
pkg_name  time
pandas    2024-01     932443.0
          2024-02    1049595.0
          2024-03    1268802.0
Name: counts, dtype: float64
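Because the monthly result is an ordinary pandas Series, the trend can be summarised with standard pandas methods. The sketch below does not call condastats. It rebuilds the Series by hand from the numbers printed above, assuming the index levels are named pkg_name and time as in that output, so that it runs offline.

```python
import pandas as pd

# Values copied from the tutorial output above. A live query would be:
# monthly = overall("pandas", start_month="2024-01", end_month="2024-03", monthly=True)
index = pd.MultiIndex.from_product(
    [["pandas"], ["2024-01", "2024-02", "2024-03"]],
    names=["pkg_name", "time"],
)
monthly = pd.Series([932443.0, 1049595.0, 1268802.0], index=index, name="counts")

# Total downloads over the whole range.
total = monthly.sum()

# Month-over-month relative change; the first month has no predecessor (NaN).
growth = monthly.pct_change()

# idxmax returns the full index label, a (pkg_name, time) tuple.
peak_month = monthly.idxmax()[1]

print(total)
print(growth.round(4))
print(peak_month)
```

With more than one package in the result, pct_change would run across package boundaries, so grouping by the pkg_name level first would be the safer choice.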
Step 4: Break down by a dimension#
Use a different subcommand (pkg_platform, data_source, pkg_version, or pkg_python) to group downloads by platform, data source, package version, or Python version:
$ condastats pkg_platform pandas --month 2024-01
>>> from condastats import pkg_platform
>>> pkg_platform("pandas", month="2024-01")
pkg_name  pkg_platform
pandas    linux-32             22.0
          linux-64         461318.0
          linux-aarch64     19375.0
          osx-64           131849.0
          osx-arm64         68324.0
          win-32              155.0
          win-64           251400.0
Name: counts, dtype: float64
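A per-platform breakdown is often easier to read once it is rolled up by operating system. The following sketch shows one way to do that with pandas. It uses the counts printed above rather than a live query, and it assumes that the part of each platform label before the first hyphen identifies the OS family.

```python
import pandas as pd

# Values copied from the tutorial output above. A live query would be:
# by_platform = pkg_platform("pandas", month="2024-01")
platforms = [
    "linux-32", "linux-64", "linux-aarch64",
    "osx-64", "osx-arm64",
    "win-32", "win-64",
]
counts = [22.0, 461318.0, 19375.0, 131849.0, 68324.0, 155.0, 251400.0]
index = pd.MultiIndex.from_product(
    [["pandas"], platforms], names=["pkg_name", "pkg_platform"]
)
by_platform = pd.Series(counts, index=index, name="counts")

# Derive the OS family ("linux", "osx", "win") from each platform label.
os_family = by_platform.index.get_level_values("pkg_platform").map(
    lambda label: label.split("-")[0]
)

# Sum the counts per OS family and express each as a share of the total.
by_os = by_platform.groupby(os_family).sum()
share = (by_os / by_os.sum()).round(3)

print(by_os)
print(share)
```

The three family totals add up to 932443, the same figure the overall query reports for 2024-01 in Step 2, which is a quick sanity check on the breakdown.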
Step 5: Compare multiple packages#
Both the CLI and the Python API accept multiple package names: separate them with spaces on the command line, or pass a list in Python:
$ condastats overall pandas numpy dask --month 2024-01
>>> overall(["pandas", "numpy", "dask"], month="2024-01")
pkg_name
dask 221200
numpy 3345821
pandas 932443
Name: counts, dtype: int64
Tip
Every function returns a pandas.Series (or a pandas.DataFrame with complete=True), so you can chain pandas operations directly on the result.
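As a small illustration of that chaining, the sketch below ranks the three packages from Step 5 and adds each one's share of the combined downloads. The Series is rebuilt by hand from the output shown in Step 5 so the example runs without condastats or network access. The variable names are illustrative only.

```python
import pandas as pd

# Values copied from the Step 5 output. A live query would be:
# totals = overall(["pandas", "numpy", "dask"], month="2024-01")
totals = pd.Series(
    [221200, 3345821, 932443],
    index=pd.Index(["dask", "numpy", "pandas"], name="pkg_name"),
    name="counts",
)

# Rank packages from most to least downloaded.
ranked = totals.sort_values(ascending=False)

# Turn the Series into a DataFrame and add a share-of-total column.
report = ranked.to_frame().assign(
    share=lambda df: df["counts"] / df["counts"].sum()
)

top = ranked.index[0]
print(report)
print(top)
```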
What next?#
Practical recipes for filtering, grouping, Jupyter, and more.
Full documentation for every function and parameter.
All subcommands, options, and exit codes.
How the data source and query pipeline work.