API Data Fetcher and Organizer
Fetch data from APIs and save it as structured files
Overview
Many services expose APIs for programmatic data access. Claude can help you write scripts that fetch data from these APIs, handle pagination and error retries, and organize the results into tables or reports.
Use Cases
- Fetching social media statistics
- Extracting e-commerce platform order information
- Collecting public data like weather and stocks
- Syncing data from third-party systems
Steps
Step 1: Understand API Structure
First, test the API call to understand the response format.
I want to get repository information from the GitHub API:
API: https://api.github.com/repos/user/repo
Request type: GET
Token required: ghp_xxxxx
Please help me:
1. Test if this API is accessible
2. Show the returned JSON structure
3. Identify the fields I need: name, stars, forks, open_issues
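A minimal sketch of what such a test script might look like, assuming the token is supplied through a GITHUB_TOKEN environment variable (the stargazers_count, forks_count, and open_issues_count names come from GitHub's repository schema):

```python
import os

import requests

# Placeholder repo path from the prompt above; the token comes from the environment.
url = "https://api.github.com/repos/user/repo"
headers = {"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}"}

resp = requests.get(url, headers=headers, timeout=10)
resp.raise_for_status()  # fail loudly if the API is not accessible
repo = resp.json()

# Keep only the fields we need.
print({
    "name": repo["name"],
    "stars": repo["stargazers_count"],
    "forks": repo["forks_count"],
    "open_issues": repo["open_issues_count"],
})
```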
Step 2: Handle Paginated Data
APIs that return large amounts of data are usually paginated, so the script needs to walk through every page.
The API uses pagination, returning 100 records per page:
- First page: https://api.example.com/data?page=1
- Get next page via Link header or next_page field
Please create a script to:
1. Automatically fetch data from all pages
2. Merge into one complete dataset
3. Display fetch progress
4. Save to ~/data/api_results.json
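A sketch of the pagination loop Claude might produce, assuming each page stores its records under a `data` key and that the API exposes either a standard Link header (which requests parses into `resp.links`) or a `next_page` field in the body:

```python
import json
import time
from pathlib import Path

import requests

def fetch_all_pages(start_url):
    """Follow pagination until there is no next page, with a polite delay."""
    records, url, page = [], start_url, 1
    while url:
        resp = requests.get(url, timeout=30)
        resp.raise_for_status()
        payload = resp.json()
        records.extend(payload["data"])  # assumed key holding each page's records
        print(f"Fetched page {page}, {len(records)} records so far")
        # Prefer the Link header; fall back to a next_page field in the body.
        url = resp.links.get("next", {}).get("url") or payload.get("next_page")
        page += 1
        time.sleep(1)  # small interval to stay under rate limits
    return records

out = Path.home() / "data" / "api_results.json"
out.parent.mkdir(parents=True, exist_ok=True)
out.write_text(json.dumps(fetch_all_pages("https://api.example.com/data?page=1"), indent=2))
```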
Step 3: Error Handling and Retries
Add fault tolerance to handle network issues.
Please enhance script reliability:
- If request fails, automatically retry 3 times
- Wait 5 seconds between retries
- On HTTP 429 (rate limited), wait 60 seconds before retrying
- Log all failed requests to a log file
- Support resumption (if interrupted, continue from last position next time)
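One way the retry logic could look: a small wrapper that distinguishes rate limiting from other failures and logs what failed:

```python
import logging
import time

import requests

logging.basicConfig(filename="fetch_errors.log", level=logging.WARNING)

def get_with_retries(url, retries=3, wait=5):
    """GET with up to `retries` attempts: 5 s between tries, 60 s on HTTP 429."""
    for attempt in range(1, retries + 1):
        try:
            resp = requests.get(url, timeout=30)
            if resp.status_code == 429:
                time.sleep(60)  # rate limited: back off longer before the next try
                continue
            resp.raise_for_status()
            return resp
        except requests.RequestException as exc:
            logging.warning("attempt %d failed for %s: %s", attempt, url, exc)
            time.sleep(wait)
    raise RuntimeError(f"{url} still failing after {retries} attempts")
```

For resumption, one common approach is to write the number of the last successfully saved page to a small state file after each page and read it on startup to choose the starting URL.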
Step 4: Data Transformation
Convert the JSON returned by the API into a more usable format.
Please process the fetched data:
1. Extract needed fields, ignore irrelevant fields
2. Flatten nested structures (e.g., user.name -> user_name)
3. Normalize datetime fields to a standard format (e.g., ISO 8601)
4. Convert to CSV file: ~/data/api_data.csv
5. Generate data dictionary explaining each field's meaning
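A sketch of the flattening and CSV conversion, assuming the Step 2 output file and that all records share the same set of fields (csv.DictWriter raises an error on unexpected keys otherwise):

```python
import csv
import json
from pathlib import Path

def flatten(record, parent_key="", sep="_"):
    """Flatten nested dicts, e.g. {"user": {"name": ...}} -> {"user_name": ...}."""
    flat = {}
    for key, value in record.items():
        new_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, new_key, sep))
        else:
            flat[new_key] = value
    return flat

data_dir = Path.home() / "data"
records = [flatten(r) for r in json.loads((data_dir / "api_results.json").read_text())]

with open(data_dir / "api_data.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=records[0].keys())
    writer.writeheader()
    writer.writerows(records)
```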
Step 5: Scheduled Auto-fetch
Create scheduled tasks to automatically update data.
Please create an automation script:
1. Encapsulate the above fetch logic into a Python script
2. Add command line argument support (e.g., specify date range)
3. Generate timestamped files for each run
4. Set up a cron job to run automatically at 2 AM daily
5. Send notification email if error occurs
Save as ~/scripts/api_fetcher.py
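A possible skeleton for the script and its cron entry; the argument names and the MAILTO address are placeholders, and cron's failure email relies on a mail agent being configured on the machine:

```python
#!/usr/bin/env python3
"""~/scripts/api_fetcher.py: fetch, transform, and save API data."""
import argparse
from datetime import datetime
from pathlib import Path

def main():
    parser = argparse.ArgumentParser(description="Fetch API data")
    parser.add_argument("--since", help="start date, YYYY-MM-DD")
    parser.add_argument("--until", help="end date, YYYY-MM-DD")
    args = parser.parse_args()

    # Timestamped output file so each run is preserved.
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    out = Path.home() / "data" / f"api_results_{stamp}.json"
    # ... call the fetch/retry/transform logic from Steps 2-4 here ...

if __name__ == "__main__":
    main()
```

```
# crontab -e: run daily at 2 AM; cron mails any output (including errors) to MAILTO
MAILTO=you@example.com
0 2 * * * /usr/bin/python3 "$HOME/scripts/api_fetcher.py"
```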
Tips
Pay attention to API rate limits: requests that come too fast can get your client throttled or blocked. Add intervals between requests and cache data that doesn't change often.
Store API keys in environment variables or config files rather than hardcoding them in scripts, where they could leak. Use a .env file to manage keys and add it to .gitignore.
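For example, with the python-dotenv package (an assumption; plain `os.environ` also works if the variable is exported in the shell):

```python
import os

from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()  # reads KEY=value pairs from a local .env file into the environment
API_TOKEN = os.environ["API_TOKEN"]  # fails fast if the key is missing
```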
Common Questions
Q: What if the API returns too much data and memory runs out? A: Use streaming: write records to a file as you fetch them instead of loading everything into memory at once. Alternatively, process in batches, fetching only a portion of the data at a time.
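A sketch of the streaming approach, writing one JSON Lines record per row as pages arrive instead of accumulating them (the endpoint and the `data`/`next_page` keys are placeholders):

```python
import json

import requests

with open("api_results.jsonl", "w") as f:
    url = "https://api.example.com/data?page=1"
    while url:
        payload = requests.get(url, timeout=30).json()
        for record in payload["data"]:
            f.write(json.dumps(record) + "\n")  # each record becomes its own line
        url = payload.get("next_page")
```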
Q: How do I handle API version upgrades? A: Pin the API version in the script (it usually goes in the URL or a header), watch the API documentation for change notices, and update the script promptly.
Q: Can multiple APIs be called at the same time? A: Yes. Claude can create scripts that call several APIs concurrently, or fetch and merge data from multiple sources; see the sketch below. Be mindful of authentication and response-format differences between the APIs.
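Since the work is network-bound rather than CPU-bound, a thread pool is a natural fit; a sketch with hypothetical endpoints:

```python
from concurrent.futures import ThreadPoolExecutor

import requests

urls = [  # placeholder endpoints; each may need its own auth headers
    "https://api.example.com/orders",
    "https://api.example.com/customers",
]

def fetch(url):
    resp = requests.get(url, timeout=30)
    resp.raise_for_status()
    return url, resp.json()

with ThreadPoolExecutor(max_workers=5) as pool:
    results = dict(pool.map(fetch, urls))  # maps each URL to its JSON payload
```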