Python API
All public functions are available directly from the top-level
condastats package:
from condastats import overall, pkg_platform, data_source, pkg_version, pkg_python
By default, every function returns a pandas.Series (download counts indexed by
package name and, optionally, by grouping dimension and/or time). The
overall() and query_overall() functions return a pandas.DataFrame instead when
complete=True.
S3-backed functions
These convenience functions read data from the public Anaconda S3 bucket
via dask and return aggregated pandas results. They require dask
and s3fs to be installed.
- condastats.overall(package, month=None, start_month=None, end_month=None, monthly=False, complete=False, pkg_platform=None, data_source=None, pkg_version=None, pkg_python=None)
Get overall download counts for one or more conda packages.
Parameters
- package : str or list of str
Package name(s) to query.
- month : str or datetime, optional
Specific month in YYYY-MM format.
- start_month : str or datetime, optional
Start of date range in YYYY-MM format. Must be used with end_month.
- end_month : str or datetime, optional
End of date range in YYYY-MM format. Must be used with start_month.
- monthly : bool, default False
If True, return monthly breakdown instead of totals.
- complete : bool, default False
If True, return the full DataFrame without aggregation.
- pkg_platform : str, optional
Filter by platform (e.g., 'linux-64', 'osx-64', 'win-64').
- data_source : str, optional
Filter by data source (e.g., 'anaconda', 'conda-forge').
- pkg_version : str, optional
Filter by package version.
- pkg_python : str or float, optional
Filter by Python version (e.g., '3.7' or 3.7).
Returns
- pandas.Series or pandas.DataFrame
Download counts, either as a Series (aggregated) or DataFrame (complete).
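As a rough usage sketch of the signature above, the calls below show a single-month total, a monthly breakdown over a date range, and an unaggregated result. The package names and dates are illustrative placeholders, and the wrapper function `overall_examples` is a hypothetical name introduced only so that nothing touches the network until it is called.

```python
def overall_examples():
    """Hypothetical wrapper; calling it needs condastats, dask, s3fs and network access."""
    # Imported lazily so that defining this function has no side effects.
    from condastats import overall

    # Total downloads of one package in a single month (a pandas.Series).
    total = overall("pandas", month="2024-01")

    # Per-month counts for two packages over a date range;
    # start_month and end_month are passed together, as documented.
    per_month = overall(
        ["pandas", "numpy"],
        start_month="2023-01",
        end_month="2023-12",
        monthly=True,
    )

    # Unaggregated rows (a pandas.DataFrame) instead of summed counts.
    rows = overall("pandas", month="2024-01", complete=True)

    return total, per_month, rows
```

Calling `overall_examples()` reads from the public S3 bucket, so it can take a while and may fail offline.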
- condastats.pkg_platform(package, month=None, start_month=None, end_month=None, monthly=False)
Get download counts grouped by platform.
Parameters
- package : str or list of str
Package name(s) to query.
- month : str or datetime, optional
Specific month in YYYY-MM format.
- start_month : str or datetime, optional
Start of date range in YYYY-MM format.
- end_month : str or datetime, optional
End of date range in YYYY-MM format.
- monthly : bool, default False
If True, return monthly breakdown.
Returns
- pandas.Series
Download counts grouped by platform.
- condastats.data_source(package, month=None, start_month=None, end_month=None, monthly=False)
Get download counts grouped by data source.
Parameters
- package : str or list of str
Package name(s) to query.
- month : str or datetime, optional
Specific month in YYYY-MM format.
- start_month : str or datetime, optional
Start of date range in YYYY-MM format.
- end_month : str or datetime, optional
End of date range in YYYY-MM format.
- monthly : bool, default False
If True, return monthly breakdown.
Returns
- pandas.Series
Download counts grouped by data source.
- condastats.pkg_version(package, month=None, start_month=None, end_month=None, monthly=False)
Get download counts grouped by package version.
Parameters
- package : str or list of str
Package name(s) to query.
- month : str or datetime, optional
Specific month in YYYY-MM format.
- start_month : str or datetime, optional
Start of date range in YYYY-MM format.
- end_month : str or datetime, optional
End of date range in YYYY-MM format.
- monthly : bool, default False
If True, return monthly breakdown.
Returns
- pandas.Series
Download counts grouped by package version.
- condastats.pkg_python(package, month=None, start_month=None, end_month=None, monthly=False)
Get download counts grouped by Python version.
Parameters
- package : str or list of str
Package name(s) to query.
- month : str or datetime, optional
Specific month in YYYY-MM format.
- start_month : str or datetime, optional
Start of date range in YYYY-MM format.
- end_month : str or datetime, optional
End of date range in YYYY-MM format.
- monthly : bool, default False
If True, return monthly breakdown.
Returns
- pandas.Series
Download counts grouped by Python version.
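Because the four grouped functions above share one signature, a caller can pick the grouping dimension by name. The sketch below is one possible way to do that. `GROUPED_FUNCTIONS` and `downloads_by` are hypothetical names, not part of condastats, and the library import is deferred so that an invalid dimension is rejected before any network access.

```python
# Names of the documented grouped S3-backed functions.
GROUPED_FUNCTIONS = ("pkg_platform", "data_source", "pkg_version", "pkg_python")


def downloads_by(dimension, package, **kwargs):
    """Hypothetical dispatcher: call the condastats function named by `dimension`.

    `kwargs` is forwarded unchanged, so it may hold month, start_month,
    end_month and monthly, as documented for the grouped functions.
    """
    if dimension not in GROUPED_FUNCTIONS:
        raise ValueError(
            f"unknown dimension {dimension!r}; expected one of {GROUPED_FUNCTIONS}"
        )
    # Imported lazily: needs condastats, dask and s3fs to be installed.
    import condastats

    return getattr(condastats, dimension)(package, **kwargs)
```

For example, `downloads_by("pkg_python", "pandas", month="2024-01")` would forward to `condastats.pkg_python("pandas", month="2024-01")`.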
Pure-pandas query functions
These functions operate on any pandas.DataFrame that follows the
Anaconda package-data schema (columns: pkg_name, counts, time,
pkg_platform, data_source, pkg_version, pkg_python).
They have no dependency on dask or s3fs and work anywhere pandas runs, including Pyodide.
- condastats.query_overall(df, package=None, monthly=False, complete=False, pkg_platform=None, data_source=None, pkg_version=None, pkg_python=None)
Get overall download counts from a pandas DataFrame.
Parameters
- df : pandas.DataFrame
DataFrame with at least pkg_name and counts columns.
- package : str or list of str, optional
Package name(s) to filter by. If None, all packages are included.
- monthly : bool, default False
If True, return monthly breakdown instead of totals.
- complete : bool, default False
If True, return the full filtered DataFrame without aggregation.
- pkg_platform : str, optional
Filter by platform (e.g., 'linux-64', 'osx-64', 'win-64').
- data_source : str, optional
Filter by data source (e.g., 'anaconda', 'conda-forge').
- pkg_version : str, optional
Filter by package version.
- pkg_python : str or float, optional
Filter by Python version (e.g., '3.7' or 3.7).
Returns
- pandas.Series or pandas.DataFrame
Download counts, either as a Series (aggregated) or DataFrame (complete).
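To make the filter, aggregate and complete options concrete, here is a plain-pandas sketch of the behaviour described above. It is not the library's implementation: `overall_sketch` is a hypothetical stand-in that handles only the package and pkg_platform filters, the rows are made up, and storing time as YYYY-MM strings is a simplification.

```python
import pandas as pd

# Synthetic rows following the documented schema; all values are invented.
df = pd.DataFrame(
    {
        "pkg_name": ["pandas", "pandas", "pandas", "numpy"],
        "counts": [10, 5, 3, 7],
        "time": ["2024-01", "2024-02", "2024-01", "2024-01"],
        "pkg_platform": ["linux-64", "linux-64", "win-64", "osx-64"],
        "data_source": ["conda-forge", "conda-forge", "anaconda", "anaconda"],
        "pkg_version": ["2.1.4", "2.2.0", "2.1.4", "1.26.3"],
        "pkg_python": ["3.11", "3.12", "3.11", "3.11"],
    }
)


def overall_sketch(df, package=None, monthly=False, complete=False, pkg_platform=None):
    """Hypothetical stand-in mirroring the documented semantics."""
    if package is not None:
        names = [package] if isinstance(package, str) else list(package)
        df = df[df["pkg_name"].isin(names)]
    if pkg_platform is not None:
        df = df[df["pkg_platform"] == pkg_platform]
    if complete:
        return df  # filtered rows, no aggregation
    keys = ["pkg_name", "time"] if monthly else ["pkg_name"]
    return df.groupby(keys)["counts"].sum()


totals = overall_sketch(df, package="pandas")
monthly = overall_sketch(df, package=["pandas", "numpy"], monthly=True)
linux_rows = overall_sketch(df, package="pandas", pkg_platform="linux-64", complete=True)
```

With condastats installed, the corresponding library call for the first result should be `query_overall(df, package="pandas")`; the exact index layout it returns may differ from this sketch.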
- condastats.query_grouped(df, column, package=None, monthly=False)
Get download counts grouped by a given dimension.
Parameters
- df : pandas.DataFrame
DataFrame with pkg_name, counts, and column columns.
- column : str
Column name to group by (e.g., 'pkg_platform', 'data_source').
- package : str or list of str, optional
Package name(s) to filter by. If None, all packages are included.
- monthly : bool, default False
If True, include a monthly breakdown.
Returns
- pandas.Series
Aggregated download counts.
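Grouping by a dimension amounts to summing counts per package and per value of the chosen column. The plain-pandas sketch below illustrates that idea on invented data. `grouped_sketch` is a hypothetical name, and the order of the index levels (package, then time, then the grouping column) is an assumption rather than something the documentation states.

```python
import pandas as pd

# Synthetic rows; only the columns needed for this example are included.
df = pd.DataFrame(
    {
        "pkg_name": ["pandas", "pandas", "pandas", "numpy"],
        "counts": [10, 5, 3, 7],
        "time": ["2024-01", "2024-02", "2024-01", "2024-01"],  # simplified to strings
        "pkg_platform": ["linux-64", "linux-64", "win-64", "linux-64"],
    }
)


def grouped_sketch(df, column, package=None, monthly=False):
    """Hypothetical stand-in: sum counts per package and per value of `column`."""
    if package is not None:
        names = [package] if isinstance(package, str) else list(package)
        df = df[df["pkg_name"].isin(names)]
    # Assumed level order; the library may arrange the index differently.
    keys = ["pkg_name", "time", column] if monthly else ["pkg_name", column]
    return df.groupby(keys)["counts"].sum()


by_platform = grouped_sketch(df, "pkg_platform", package="pandas")
by_platform_monthly = grouped_sketch(df, "pkg_platform", package="pandas", monthly=True)
```

The documented library call for the first result is `query_grouped(df, "pkg_platform", package="pandas")`.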
Common parameters
The S3-backed functions share a core set of parameters:
- package
One or more package names. Pass a string for a single package or a list of strings for multiple packages.
- month
A specific month in YYYY-MM format (e.g., "2024-01"). Mutually exclusive with start_month / end_month.
- start_month / end_month
Define a date range. Both must be provided together, in YYYY-MM format.
- monthly
When True, return a per-month breakdown instead of a single total. Adds a time level to the result index.
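The rules for the date parameters can be checked before making a slow remote query. The helper below is a minimal sketch of such a pre-flight check. `check_month_args` is a hypothetical name, it handles string inputs only (the functions also accept datetime values), and whether condastats itself raises on these conditions is not stated here.

```python
import re

# Four-digit year, a hyphen, and a month from 01 to 12.
_MONTH_RE = re.compile(r"\d{4}-(0[1-9]|1[0-2])")


def check_month_args(month=None, start_month=None, end_month=None):
    """Hypothetical check of the documented date-parameter rules (strings only)."""
    # month is mutually exclusive with start_month / end_month.
    if month is not None and (start_month is not None or end_month is not None):
        raise ValueError("month cannot be combined with start_month/end_month")
    # start_month and end_month must be provided together.
    if (start_month is None) != (end_month is None):
        raise ValueError("start_month and end_month must be provided together")
    # Every supplied value must be in YYYY-MM format.
    for name, value in (
        ("month", month),
        ("start_month", start_month),
        ("end_month", end_month),
    ):
        if value is not None and not _MONTH_RE.fullmatch(value):
            raise ValueError(f"{name} must be in YYYY-MM format, got {value!r}")
```

A call such as `check_month_args(start_month="2023-01", end_month="2023-12")` returns silently, while passing only `start_month` raises `ValueError`.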
Return types
| Scenario | Return type |
|---|---|
| Default (aggregated) | pandas.Series |