This repository contains several one-off or infrequently used scripts for Dataverse-related work.
- GitHub Issues to CSV - Pull selected GitHub issues into a CSV file
- EZID DOI update/verify - Update EZID target URLs for migrated datasets and verify that the DOIs point to the correct URL
- Basic Stress Tests - Run basic browsing scenarios
## GitHub Issues to CSV

Use the GitHub API to pull issues into a CSV file.
### Setup

- Requires virtualenvwrapper
  - OS X install: `sudo pip install virtualenvwrapper`
- Open a Terminal
- cd into `src/github_issue_scraper`
- Make a virtualenv: `mkvirtualenv github_issue_scraper`
- Install packages (fast): `pip install -r requirements/base.txt`
- Within `src/github_issue_scraper`, copy `creds-template.json` to `creds.json` (in the same folder)
- Change the `creds.json` settings appropriately.
### Running

- Open a Terminal
- cd into `src/github_issue_scraper`
- Type `workon github_issue_scraper` (and press Return)
- Set your repository, token information, output file name, and filters in `creds.json`
- Run the program from the Terminal: `python pull_issues.py`
- An output file will be written to `src/github_issue_scraper/output/[file specified in creds.json]`
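Conceptually, the script turns the `creds.json` filters into GitHub API query parameters and writes the returned issues as CSV rows. The sketch below illustrates that flow under stated assumptions: the function names and CSV columns are illustrative, not the actual internals of `pull_issues.py`, and the network call itself is omitted.

```python
import csv
import io
from urllib.parse import urlencode

# Hypothetical sketch of what pull_issues.py does: build a GitHub issues API
# URL from the creds.json filters and render issue rows as CSV. Names and
# columns here are illustrative; the real script's internals may differ.

API_ROOT = "https://api.github.com"

def build_issues_url(repository, filters):
    """Attach non-blank filters as query parameters; blank filters are left
    out. labels_to_exclude is skipped because the GitHub issues endpoint has
    no such parameter -- excluded labels must be filtered client-side."""
    params = {}
    for key, value in filters.items():
        if value and key != "labels_to_exclude":
            # strip spaces around commas before attaching to the API URL
            params[key] = ",".join(part.strip() for part in value.split(","))
    url = f"{API_ROOT}/repos/{repository}/issues"
    return f"{url}?{urlencode(params)}" if params else url

def issues_to_csv(issues):
    """Render a list of issue dicts (shaped like the API's JSON) as CSV."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["number", "title", "labels"])
    for issue in issues:
        labels = "; ".join(label["name"] for label in issue.get("labels", []))
        writer.writerow([issue["number"], issue["title"], labels])
    return out.getvalue()

url = build_issues_url("iqss/dataverse",
                       {"labels": "Component: API, Priority: Medium",
                        "assignee": "", "creator": "",
                        "labels_to_exclude": "Status: QA"})
print(url)
```

Blank filters and `labels_to_exclude` never reach the URL, which matches the "leave filters blank to exclude them" behavior described for `creds.json` below.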
### Sample creds.json

```json
{
    "REPOSITORY_NAME" : "iqss/dataverse",
    "API_USERNAME" : "jsmith",
    "API_ACCESS_TOKEN" : "access-token-for-your-repo",
    "OUTPUT_FILE_NAME" : "github-issues.csv",
    "GITHUB_ISSUE_FILTERS" : {
        "labels" : "Component: API",
        "assignee" : "",
        "creator" : "",
        "labels_to_exclude" : "Status: QA"
    }
}
```

- `API_USERNAME` - your GitHub username without the `@`
- `API_ACCESS_TOKEN` - see: https://github.com/blog/1509-personal-api-tokens
- `OUTPUT_FILE_NAME` - always written to `src/github_issue_scraper/output/(file name)`
- `GITHUB_ISSUE_FILTERS` - leave filters blank to exclude them.
- Leave filters blank to exclude them. For example, this JSON would include all `assignee` values:
  - `"assignee" : "",`
- Comma-separate multiple `labels` and `labels_to_exclude` values; spaces after commas are stripped before attaching them to the API URL. For example, to match issues with the 3 labels `Component: API`, `Priority: Medium`, and `Status: Design`:
  - `"labels" : "Component: API, Priority: Medium, Status: Design",`

## EZID DOI update/verify

- Location: `src/ezid_helper`
Scripts for two basic tasks:

1. Update EZID target URLs for migrated datasets.
2. Quality check: verify that the DOIs point to the correct URL.
### Input file

- Pipe (`|`) delimited .csv file with the following columns:
  - Dataset id (pk from the 4.0 db table `dataset`)
  - Protocol
  - Authority
  - Identifier
- Sample rows:

```
66319|doi|10.7910/DVN|29379
66318|doi|10.7910/DVN|29117
66317|doi|10.7910/DVN|28746
66316|doi|10.7910/DVN|29559
```
The input file is the result of a query from the postgres psql shell:

- Basic query:

```sql
select id, protocol, authority, identifier from dataset where protocol='doi' and authority='10.7910/DVN' order by id desc;
```

- Basic query to a pipe (`|`) delimited text file:

```sql
COPY (select id, protocol, authority, identifier from dataset where protocol='doi' and authority='10.7910/DVN' order by id desc) TO '/tmp/file-name-with-dataset-ids.csv' (format csv, delimiter '|');
```

### Running

(to do)
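Both the update and verify tasks need to turn each row of the input file into a DOI string. A minimal sketch of that parsing step, assuming the four-column layout shown in the sample rows above; the actual scripts in `src/ezid_helper` may differ, and the target URL pattern below is a hypothetical example, not necessarily the one the scripts send to EZID:

```python
# Sketch of parsing one pipe-delimited input row into a DOI string.
# Column layout (id|protocol|authority|identifier) comes from the sample
# rows above; the target URL host/path below is hypothetical.

def parse_row(line):
    """Split 'id|protocol|authority|identifier' into (dataset_id, doi)."""
    dataset_id, protocol, authority, identifier = line.strip().split("|")
    return dataset_id, f"{protocol}:{authority}/{identifier}"

dataset_id, doi = parse_row("66319|doi|10.7910/DVN|29379")
print(dataset_id, doi)

# Hypothetical target URL an update script might register with EZID:
target_url = f"https://dataverse.example.edu/dataset.xhtml?persistentId={doi}"
```

The verify task would then fetch each DOI's current EZID target and compare it against the expected URL built this way.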
## Basic Stress Tests

These are basic tests using locustio.
### Setup

- Requires virtualenvwrapper
  - OS X install: `sudo pip install virtualenvwrapper`
  - Don't forget the Shell Startup File: https://virtualenvwrapper.readthedocs.org/en/latest/install.html#shell-startup-file
- Open a Terminal
- cd into `src/stress_tests`
- Make a virtualenv: `mkvirtualenv stress_tests`
- Install locustio: `pip install -r requirements/base.txt` (this takes a couple of minutes)
- Within `src/stress_tests`, copy `creds-template.json` to `creds.json` (in the same folder)
- Change the `creds.json` settings appropriately.
### Running basic_test_02.py

- Open a Terminal
- cd into `src/stress_tests`
- Type `workon stress_tests` (and press Return)
- Set your server and other information in `creds.json`
- Run the test script from the Terminal: `locust -f basic_test_02.py`
- Open a browser and go to: http://127.0.0.1:8089/
### Running dataverse_perf.py

- Set your server and other information in `creds.json`
- cd into `src/stress_tests`
- Run the test script from the Terminal: `locust -f dataverse_perf.py`
- Open a browser and go to: http://127.0.0.1:8089/
- HOST: https://qa.dataverse.org
- Duration: 10 minutes