diff --git a/docs/01_introduction/01_overview.mdx b/docs/01_introduction/01_overview.mdx
new file mode 100644
index 00000000..9d9bb55b
--- /dev/null
+++ b/docs/01_introduction/01_overview.mdx
@@ -0,0 +1,26 @@
+---
+id: overview
+title: Overview
+description: 'The official Python library for accessing the Apify API, providing synchronous and asynchronous interfaces for Actors, datasets, and storage.'
+---
+
+## Introduction
+
+The [Apify client for Python](https://github.com/apify/apify-client-python) is the official library to access the [Apify REST API](/api/v2) from your Python applications. It provides useful features like automatic retries and convenience functions that improve the experience of using the Apify API.
+
+Key features:
+
+- Synchronous and asynchronous interfaces for flexible integration
+- Automatic retries for improved reliability
+- JSON encoding with UTF-8 for all requests and responses
+- Comprehensive API coverage for [Actors](/platform/actors), [datasets](/platform/storage/dataset), [key-value stores](/platform/storage/key-value-store), and more
+
+## Next steps
+
+Now that you're familiar with the basics, explore more advanced features:
+
+- [Asyncio support](/concepts/asyncio-support) - Learn about asynchronous programming with the client
+- Common use-case examples like:
+  - [Passing an input to Actor](/api/client/python/docs/examples/passing-input-to-actor)
+  - [Retrieve Actor data](/api/client/python/docs/examples/retrieve-actor-data)
+- [API Reference](/api/client/python/reference) - Browse the complete API documentation
diff --git a/docs/01_introduction/02_installation.mdx b/docs/01_introduction/02_installation.mdx
new file mode 100644
index 00000000..8f7bb8e4
--- /dev/null
+++ b/docs/01_introduction/02_installation.mdx
@@ -0,0 +1,39 @@
+---
+id: installation
+title: Installation
+description: 'How to install the Apify client for Python and verify the installation.'
+---
+
+## Prerequisites
+
+Before installing the Apify client, ensure your system meets the following requirements:
+
+- _An Apify account_
+- _Python 3.10 or higher_: You can download Python from the [official website](https://www.python.org/downloads/).
+- _Python package manager_: While this guide uses [pip](https://pip.pypa.io/en/stable/), you can also use any package manager you want.
+
+To verify that Python and pip are installed, run the following commands:
+
+```sh
+python --version
+```
+
+```sh
+pip --version
+```
+
+If these commands return the respective versions, you're ready to continue.
+
+## Installation
+
+The Apify client is available as the [`apify-client`](https://pypi.org/project/apify-client/) package on PyPI. To install it, run:
+
+```sh
+pip install apify-client
+```
+
+After installation, verify that the client is installed correctly by checking its version:
+
+```sh
+python -c 'import apify_client; print(apify_client.__version__)'
+```
diff --git a/docs/01_overview/index.mdx b/docs/01_introduction/03_quick_start.mdx
similarity index 63%
rename from docs/01_overview/index.mdx
rename to docs/01_introduction/03_quick_start.mdx
index a47d1790..070fc3f5 100644
--- a/docs/01_overview/index.mdx
+++ b/docs/01_introduction/03_quick_start.mdx
@@ -1,6 +1,11 @@
 ---
-id: overview
-title: Overview
+id: quick-start
+title: Quick start
+description: 'Get started with the Apify client for Python by running your first Actor and retrieving results.'
+---
+
+Learn how to start Actors and retrieve their results using the Apify Client.
+
 ---

 import Tabs from '@theme/Tabs';
@@ -16,52 +21,7 @@ import InputSyncExample from '!!raw-loader!./code/03_input_sync.py';
 import DatasetAsyncExample from '!!raw-loader!./code/03_dataset_async.py';
 import DatasetSyncExample from '!!raw-loader!./code/03_dataset_sync.py';

-## Introduction
-
-The [Apify client for Python](https://github.com/apify/apify-client-python) is the official library to access the [Apify REST API](/api/v2) from your Python applications. It provides useful features like automatic retries and convenience functions that improve the experience of using the Apify API.
-
-Key features:
-
-- Synchronous and asynchronous interfaces for flexible integration
-- Automatic retries for improved reliability
-- JSON encoding with UTF-8 for all requests and responses
-- Comprehensive API coverage for [Actors](/platform/actors), [datasets](/platform/storage/dataset), [key-value stores](/platform/storage/key-value-store), and more
-
-## Prerequisites
-
-Before installing the Apify client, ensure your system meets the following requirements:
-
-- _An Apify account_
-- _Python 3.10 or higher_: You can download Python from the [official website](https://www.python.org/downloads/).
-- _Python package manager_: While this guide uses [pip](https://pip.pypa.io/en/stable/), you can also use any package manager you want.
-
-To verify that Python and pip are installed, run the following commands:
-
-```sh
-python --version
-```
-
-```sh
-pip --version
-```
-
-If these commands return the respective versions, you're ready to continue.
-
-## Installation
-
-The Apify client is available as the [`apify-client`](https://pypi.org/project/apify-client/) package on PyPI. To install it, run:
-
-```sh
-pip install apify-client
-```
-
-After installation, verify that the client is installed correctly by checking its version:
-
-```sh
-python -c 'import apify_client; print(apify_client.__version__)'
-```
-
-## Authentication and initialization
+## Step 1: Authentication and initialization

 To use the client, you need an [API token](/platform/integrations/api#api-token). You can find your token under the [Integrations](https://console.apify.com/account/integrations) tab in Apify Console. Copy the token and initialize the client by providing it as a parameter to the `ApifyClient` constructor.

@@ -84,11 +44,7 @@ The API token is used to authorize your requests to the Apify API. You can be ch

 :::

-## Quick start
-
-Now that you have the client set up, let's explore how to run Actors on the Apify platform, provide input to them, and retrieve their results.
-
-### Running your first Actor
+## Step 2: Running your first Actor

 To start an Actor, you need its ID (e.g., `john-doe/my-cool-actor`) and an API token. The Actor's ID is a combination of the Actor name and the Actor owner's username. Use the [`ActorClient`](/reference/class/ActorClient) to run the Actor and wait for it to complete. You can run both your own Actors and [Actors from Apify Store](https://docs.apify.com/platform/actors/running/actors-in-store).

@@ -105,7 +61,7 @@ To start an Actor, you need its ID (e.g., `john-doe/my-cool-actor`) and an API t
-### Providing input to Actor
+### Passing input to the Actor

 Actors often require input, such as URLs to scrape, search terms, or other configuration data. You can pass input as a JSON object when starting the Actor using the [`ActorClient.call`](/reference/class/ActorClient#call) method. Actors respect the input schema defined in the Actor's [input schema](https://docs.apify.com/platform/actors/development/actor-definition/input-schema).
@@ -122,7 +78,7 @@ Actors often require input, such as URLs to scrape, search terms, or other confi
-### Getting results from the dataset
+## Step 3: Getting results from the dataset

 To get the results from the dataset, you can use the [`DatasetClient`](/reference/class/DatasetClient) ([`ApifyClient.dataset`](/reference/class/ApifyClient#dataset)) and [`DatasetClient.list_items`](/reference/class/DatasetClient#list_items) method. You need to pass the dataset ID to define which dataset you want to access. You can get the dataset ID from the Actor's run dictionary (represented by `defaultDatasetId`).

@@ -144,13 +100,3 @@ To get the results from the dataset, you can use the [`DatasetClient`](/referenc
 Running an Actor might take time, depending on the Actor's complexity and the amount of data it processes. If you want only to get data and have an immediate response, you should access the existing dataset of the finished [Actor run](https://docs.apify.com/platform/actors/running/runs-and-builds#runs).

 :::
-
-## Next steps
-
-Now that you're familiar with the basics, explore more advanced features:
-
-- [Asyncio support](/concepts/asyncio-support) - Learn about asynchronous programming with the client
-- Common use-case examples like:
-  - [Passing an input to Actor](/api/client/python/docs/examples/passing-input-to-actor)
-  - [Retrieve Actor data](/api/client/python/docs/examples/retrieve-actor-data)
-- [API Reference](/api/client/python/reference) - Browse the complete API documentation
diff --git a/docs/01_overview/code/01_usage_async.py b/docs/01_introduction/code/01_usage_async.py
similarity index 100%
rename from docs/01_overview/code/01_usage_async.py
rename to docs/01_introduction/code/01_usage_async.py
diff --git a/docs/01_overview/code/01_usage_sync.py b/docs/01_introduction/code/01_usage_sync.py
similarity index 100%
rename from docs/01_overview/code/01_usage_sync.py
rename to docs/01_introduction/code/01_usage_sync.py
diff --git a/docs/01_overview/code/02_auth_async.py b/docs/01_introduction/code/02_auth_async.py
similarity index 100%
rename from docs/01_overview/code/02_auth_async.py
rename to docs/01_introduction/code/02_auth_async.py
diff --git a/docs/01_overview/code/02_auth_sync.py b/docs/01_introduction/code/02_auth_sync.py
similarity index 100%
rename from docs/01_overview/code/02_auth_sync.py
rename to docs/01_introduction/code/02_auth_sync.py
diff --git a/docs/01_overview/code/03_dataset_async.py b/docs/01_introduction/code/03_dataset_async.py
similarity index 100%
rename from docs/01_overview/code/03_dataset_async.py
rename to docs/01_introduction/code/03_dataset_async.py
diff --git a/docs/01_overview/code/03_dataset_sync.py b/docs/01_introduction/code/03_dataset_sync.py
similarity index 100%
rename from docs/01_overview/code/03_dataset_sync.py
rename to docs/01_introduction/code/03_dataset_sync.py
diff --git a/docs/01_overview/code/03_input_async.py b/docs/01_introduction/code/03_input_async.py
similarity index 100%
rename from docs/01_overview/code/03_input_async.py
rename to docs/01_introduction/code/03_input_async.py
diff --git a/docs/01_overview/code/03_input_sync.py b/docs/01_introduction/code/03_input_sync.py
similarity index 100%
rename from docs/01_overview/code/03_input_sync.py
rename to docs/01_introduction/code/03_input_sync.py
diff --git a/docs/02_concepts/02_single_collection_clients.mdx b/docs/02_concepts/02_client_architecture.mdx
similarity index 100%
rename from docs/02_concepts/02_single_collection_clients.mdx
rename to docs/02_concepts/02_client_architecture.mdx
diff --git a/docs/03_examples/01_passing_input_to_actor.mdx b/docs/03_guides/01_passing_input_to_actors.mdx
similarity index 100%
rename from docs/03_examples/01_passing_input_to_actor.mdx
rename to docs/03_guides/01_passing_input_to_actors.mdx
diff --git a/docs/03_examples/02_manage_tasks_for_reusable_input.mdx b/docs/03_guides/02_managing_tasks.mdx
similarity index 100%
rename from docs/03_examples/02_manage_tasks_for_reusable_input.mdx
rename to docs/03_guides/02_managing_tasks.mdx
diff --git a/docs/03_examples/03_retrieve_actor_data.mdx b/docs/03_guides/03_retrieving_data.mdx
similarity index 100%
rename from docs/03_examples/03_retrieve_actor_data.mdx
rename to docs/03_guides/03_retrieving_data.mdx
diff --git a/docs/02_concepts/08_pagination.mdx b/docs/03_guides/04_pagination.mdx
similarity index 100%
rename from docs/02_concepts/08_pagination.mdx
rename to docs/03_guides/04_pagination.mdx
diff --git a/docs/02_concepts/09_streaming.mdx b/docs/03_guides/05_streaming_resources.mdx
similarity index 100%
rename from docs/02_concepts/09_streaming.mdx
rename to docs/03_guides/05_streaming_resources.mdx
diff --git a/docs/02_concepts/07_convenience_methods.mdx b/docs/03_guides/06_convenience_methods.mdx
similarity index 100%
rename from docs/02_concepts/07_convenience_methods.mdx
rename to docs/03_guides/06_convenience_methods.mdx
diff --git a/docs/03_examples/04_integration_with_data_libraries.mdx b/docs/03_guides/07_integrating_with_pandas.mdx
similarity index 100%
rename from docs/03_examples/04_integration_with_data_libraries.mdx
rename to docs/03_guides/07_integrating_with_pandas.mdx
diff --git a/docs/03_examples/code/01_input_async.py b/docs/03_guides/code/01_input_async.py
similarity index 100%
rename from docs/03_examples/code/01_input_async.py
rename to docs/03_guides/code/01_input_async.py
diff --git a/docs/03_examples/code/01_input_sync.py b/docs/03_guides/code/01_input_sync.py
similarity index 100%
rename from docs/03_examples/code/01_input_sync.py
rename to docs/03_guides/code/01_input_sync.py
diff --git a/docs/03_examples/code/02_tasks_async.py b/docs/03_guides/code/02_tasks_async.py
similarity index 100%
rename from docs/03_examples/code/02_tasks_async.py
rename to docs/03_guides/code/02_tasks_async.py
diff --git a/docs/03_examples/code/02_tasks_sync.py b/docs/03_guides/code/02_tasks_sync.py
similarity index 100%
rename from docs/03_examples/code/02_tasks_sync.py
rename to docs/03_guides/code/02_tasks_sync.py
diff --git a/docs/03_examples/code/03_retrieve_async.py b/docs/03_guides/code/03_retrieve_async.py
similarity index 100%
rename from docs/03_examples/code/03_retrieve_async.py
rename to docs/03_guides/code/03_retrieve_async.py
diff --git a/docs/03_examples/code/03_retrieve_sync.py b/docs/03_guides/code/03_retrieve_sync.py
similarity index 100%
rename from docs/03_examples/code/03_retrieve_sync.py
rename to docs/03_guides/code/03_retrieve_sync.py
diff --git a/docs/03_examples/code/04_pandas_async.py b/docs/03_guides/code/04_pandas_async.py
similarity index 100%
rename from docs/03_examples/code/04_pandas_async.py
rename to docs/03_guides/code/04_pandas_async.py
diff --git a/docs/03_examples/code/04_pandas_sync.py b/docs/03_guides/code/04_pandas_sync.py
similarity index 100%
rename from docs/03_examples/code/04_pandas_sync.py
rename to docs/03_guides/code/04_pandas_sync.py
diff --git a/docs/03_guides/code/07_call_async.py b/docs/03_guides/code/07_call_async.py
new file mode 100644
index 00000000..955f7532
--- /dev/null
+++ b/docs/03_guides/code/07_call_async.py
@@ -0,0 +1,14 @@
+from apify_client import ApifyClientAsync
+
+TOKEN = 'MY-APIFY-TOKEN'
+
+
+async def main() -> None:
+    apify_client = ApifyClientAsync(TOKEN)
+    actor_client = apify_client.actor('username/actor-name')
+
+    # Start an Actor and wait for it to finish
+    finished_actor_run = await actor_client.call()
+
+    # Start an Actor and wait at most 60 seconds (1 minute) for it to finish
+    actor_run = await actor_client.start(wait_for_finish=60)
diff --git a/docs/03_guides/code/07_call_sync.py b/docs/03_guides/code/07_call_sync.py
new file mode 100644
index 00000000..af790eaa
--- /dev/null
+++ b/docs/03_guides/code/07_call_sync.py
@@ -0,0 +1,14 @@
+from apify_client import ApifyClient
+
+TOKEN = 'MY-APIFY-TOKEN'
+
+
+def main() -> None:
+    apify_client = ApifyClient(TOKEN)
+    actor_client = apify_client.actor('username/actor-name')
+
+    # Start an Actor and wait for it to finish
+    finished_actor_run = actor_client.call()
+
+    # Start an Actor and wait at most 60 seconds (1 minute) for it to finish
+    actor_run = actor_client.start(wait_for_finish=60)
diff --git a/docs/03_guides/code/08_pagination_async.py b/docs/03_guides/code/08_pagination_async.py
new file mode 100644
index 00000000..50e9d047
--- /dev/null
+++ b/docs/03_guides/code/08_pagination_async.py
@@ -0,0 +1,35 @@
+from apify_client import ApifyClientAsync
+
+TOKEN = 'MY-APIFY-TOKEN'
+
+
+async def main() -> None:
+    apify_client = ApifyClientAsync(TOKEN)
+
+    # Initialize the dataset client
+    dataset_client = apify_client.dataset('dataset-id')
+
+    # Define the pagination parameters
+    limit = 1000  # Number of items per page
+    offset = 0  # Starting offset
+    all_items = []  # List to store all fetched items
+
+    while True:
+        # Fetch a page of items
+        response = await dataset_client.list_items(limit=limit, offset=offset)
+        items = response.items
+        total = response.total
+
+        print(f'Fetched {len(items)} items')
+
+        # Add the fetched items to the complete list
+        all_items.extend(items)
+
+        # Exit the loop if there are no more items to fetch
+        if offset + limit >= total:
+            break
+
+        # Increment the offset for the next page
+        offset += limit
+
+    print(f'Overall fetched {len(all_items)} items')
diff --git a/docs/03_guides/code/08_pagination_sync.py b/docs/03_guides/code/08_pagination_sync.py
new file mode 100644
index 00000000..3beb4fbe
--- /dev/null
+++ b/docs/03_guides/code/08_pagination_sync.py
@@ -0,0 +1,35 @@
+from apify_client import ApifyClient
+
+TOKEN = 'MY-APIFY-TOKEN'
+
+
+def main() -> None:
+    apify_client = ApifyClient(TOKEN)
+
+    # Initialize the dataset client
+    dataset_client = apify_client.dataset('dataset-id')
+
+    # Define the pagination parameters
+    limit = 1000  # Number of items per page
+    offset = 0  # Starting offset
+    all_items = []  # List to store all fetched items
+
+    while True:
+        # Fetch a page of items
+        response = dataset_client.list_items(limit=limit, offset=offset)
+        items = response.items
+        total = response.total
+
+        print(f'Fetched {len(items)} items')
+
+        # Add the fetched items to the complete list
+        all_items.extend(items)
+
+        # Exit the loop if there are no more items to fetch
+        if offset + limit >= total:
+            break
+
+        # Increment the offset for the next page
+        offset += limit
+
+    print(f'Overall fetched {len(all_items)} items')
diff --git a/docs/03_guides/code/09_streaming_async.py b/docs/03_guides/code/09_streaming_async.py
new file mode 100644
index 00000000..5459784e
--- /dev/null
+++ b/docs/03_guides/code/09_streaming_async.py
@@ -0,0 +1,14 @@
+from apify_client import ApifyClientAsync
+
+TOKEN = 'MY-APIFY-TOKEN'
+
+
+async def main() -> None:
+    apify_client = ApifyClientAsync(TOKEN)
+    run_client = apify_client.run('MY-RUN-ID')
+    log_client = run_client.log()
+
+    async with log_client.stream() as log_stream:
+        if log_stream:
+            async for bytes_chunk in log_stream.aiter_bytes():
+                print(bytes_chunk)
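Review note on the streaming examples: both `09_streaming_*.py` files print raw byte chunks, so a log line (or a multi-byte UTF-8 character) can be split across two chunks. A possible follow-up is a small helper that reassembles chunks into text lines. The sketch below is only an illustration and not part of this PR or of the client library: `iter_log_lines` is a hypothetical name, and it assumes the log is UTF-8 text with `\n` line endings.

```python
import codecs
from collections.abc import Iterable, Iterator


def iter_log_lines(chunks: Iterable[bytes], encoding: str = 'utf-8') -> Iterator[str]:
    """Yield complete text lines from a stream of raw byte chunks.

    Hypothetical helper for illustration; not part of apify-client.
    """
    # An incremental decoder holds back partial multi-byte characters
    # until the rest of the character arrives in a later chunk.
    decoder = codecs.getincrementaldecoder(encoding)(errors='replace')
    buffer = ''
    for chunk in chunks:
        buffer += decoder.decode(chunk)
        # Everything before the last newline is complete; keep the remainder.
        *lines, buffer = buffer.split('\n')
        yield from lines
    # Flush bytes still held by the decoder, then emit any unterminated tail.
    buffer += decoder.decode(b'', final=True)
    if buffer:
        yield buffer


if __name__ == '__main__':
    # Made-up chunks standing in for `log_stream.iter_bytes()`.
    fake_chunks = [b'Starting Ac', b'tor\ncaf\xc3', b'\xa9 done\nlast']
    for line in iter_log_lines(fake_chunks):
        print(line)
```

In the sync example, the loop body could then become `for line in iter_log_lines(log_stream.iter_bytes()): print(line)`. The async variant would need an `async for` version of the same buffering logic.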
diff --git a/docs/03_guides/code/09_streaming_sync.py b/docs/03_guides/code/09_streaming_sync.py
new file mode 100644
index 00000000..e7617ab3
--- /dev/null
+++ b/docs/03_guides/code/09_streaming_sync.py
@@ -0,0 +1,14 @@
+from apify_client import ApifyClient
+
+TOKEN = 'MY-APIFY-TOKEN'
+
+
+def main() -> None:
+    apify_client = ApifyClient(TOKEN)
+    run_client = apify_client.run('MY-RUN-ID')
+    log_client = run_client.log()
+
+    with log_client.stream() as log_stream:
+        if log_stream:
+            for bytes_chunk in log_stream.iter_bytes():
+                print(bytes_chunk)
diff --git a/website/docusaurus.config.js b/website/docusaurus.config.js
index 44c8cc03..6c5a965b 100644
--- a/website/docusaurus.config.js
+++ b/website/docusaurus.config.js
@@ -45,7 +45,7 @@ module.exports = {
             title: 'API Client for Python',
             items: [
                 {
-                    to: 'docs/overview',
+                    to: 'docs/introduction/overview',
                     label: 'Docs',
                     position: 'left',
                     activeBaseRegex: '/docs(?!/changelog)',
diff --git a/website/sidebars.js b/website/sidebars.js
index 5e053e71..6338267d 100644
--- a/website/sidebars.js
+++ b/website/sidebars.js
@@ -1,8 +1,15 @@
 module.exports = {
     sidebar: [
         {
-            type: 'doc',
-            id: 'overview/overview',
+            type: 'category',
+            label: 'Introduction',
+            collapsed: false,
+            items: [
+                {
+                    type: 'autogenerated',
+                    dirName: '01_introduction',
+                },
+            ],
         },
         {
             type: 'category',
@@ -17,12 +24,12 @@ module.exports = {
         },
         {
             type: 'category',
-            label: 'Examples',
+            label: 'Guides',
             collapsed: true,
             items: [
                 {
                     type: 'autogenerated',
-                    dirName: '03_examples',
+                    dirName: '03_guides',
                 },
             ],
         },
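Review note on the pagination examples: the offset loop in `08_pagination_sync.py` collects every page into one list before anything can be processed. One way to factor it, sketched here under stated assumptions, is a generator that takes any page-fetching callable accepting `limit` and `offset` keyword arguments and returning an object with `items` and `total`, as `list_items` is used in this diff. `Page`, `iter_all_items`, and `fake_list_items` are hypothetical names for illustration, not client API.

```python
from collections.abc import Callable, Iterator
from dataclasses import dataclass
from typing import Any


@dataclass
class Page:
    """Hypothetical stand-in for the page object returned by `list_items`."""

    items: list[Any]
    total: int


def iter_all_items(fetch_page: Callable[..., Page], limit: int = 1000) -> Iterator[Any]:
    """Yield items page by page using offset-based pagination."""
    offset = 0
    while True:
        page = fetch_page(limit=limit, offset=offset)
        yield from page.items
        offset += limit
        # Stop once the next offset is past the end, or if a page came back empty.
        if offset >= page.total or not page.items:
            break


def fake_list_items(*, limit: int, offset: int) -> Page:
    """Offline stand-in for `dataset_client.list_items`, serving 25 made-up items."""
    data = list(range(25))
    return Page(items=data[offset:offset + limit], total=len(data))


if __name__ == '__main__':
    fetched = list(iter_all_items(fake_list_items, limit=10))
    print(f'Fetched {len(fetched)} items in total')
```

With the real client, the call would presumably look like `iter_all_items(dataset_client.list_items)`. The extra empty-page check is a defensive choice so the loop cannot spin if `total` changes between requests.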