Get Started
Plan and execute your first data collaboration on Senate
Prerequisites
Scope technical and analytical requirements early
If your organization will be analyzing datasets on Senate, consider the following:
(1) How much processing power will you need?
Please ask your Senate Host what Workspace size will be provided for your project.
(2) What is your team’s working arrangement?
More than one authorized user can have access to the same Workspace; however, to ensure optimal performance, only one user can connect to the Workspace at a time. Do you require team members to access datasets at the same time for analysis?
(3) What software versions do you need?
Senate supports the following software and versions:
Windows Workspaces run Anaconda3 2019.10 (Python 3.7) by default, which includes Jupyter Notebook and Qt Console. Anaconda in the standard Workspace currently comes with the packages ticked on this page: https://docs.anaconda.com/anaconda/packages/old-pkg-lists/2019.03/py3.7_win-64/.
Linux Workspaces run Python 3.6 by default, and the base software is installed for Python 3.6. Anaconda in the standard Linux Workspace currently comes with the packages ticked on this page: https://docs.anaconda.com/anaconda/packages/py3.6_linux-64/
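Once you are connected to a Workspace, a quick way to confirm what is actually installed is to ask the interpreter directly. The following is a minimal sketch, not an official Senate utility: the function name describe_environment and the package names passed to it are illustrative assumptions, and it only uses standard-library calls that exist in both Python 3.6 and 3.7.

```python
import importlib.util
import platform


def describe_environment(packages):
    """Return the running Python version and whether each top-level package is importable."""
    return {
        "python": platform.python_version(),
        # find_spec returns None when a top-level module cannot be found.
        "available": {
            name: importlib.util.find_spec(name) is not None for name in packages
        },
    }


# Illustrative package names; replace them with the ones your analysis needs.
report = describe_environment(["json", "numpy", "pandas"])
print("Python", report["python"])
for name, present in sorted(report["available"].items()):
    print(name, "installed" if present else "missing")
```

Comparing this output with your list of required packages shows which ones you would still need to install or request.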
(4) What pip-installable Python packages do you need?
Analysts are able to install library dependencies without needing Data Republic’s help, giving you more control over your Workspace.
Please note: packages that have external binary dependencies, or other components not directly installable from PyPI, PythonHosted, CRAN, or Anaconda, will require a support request to be installed.
To enable you to install your own Python and R libraries without compromising data protection, all installation requests go through our special proxy service. The proxy only allows “read-only” requests to install standard packages from the specific repositories that Data Republic has whitelisted.
- Python packages from PyPI and Anaconda can be installed by users to Workspaces.
- R packages from CRAN within RStudio (Linux) can be installed by users to Workspaces.
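As an illustration of a self-service install, the sketch below builds the pip command for a single pinned package using the interpreter that is currently running. The helper name pip_install_command and the example package and version are assumptions chosen for illustration; the sketch only prints the command, and whether the install succeeds depends on the package being available from a whitelisted repository.

```python
import sys


def pip_install_command(package, version=None):
    """Build the argument list that installs one package with the current interpreter's pip."""
    requirement = package if version is None else f"{package}=={version}"
    return [sys.executable, "-m", "pip", "install", requirement]


# Illustrative package and version only.
command = pip_install_command("requests", "2.22.0")
print(" ".join(command))

# To actually run the install from a notebook or script:
#   import subprocess
#   subprocess.check_call(command)
```

Calling pip through sys.executable installs into the same Python environment that runs your notebook, which avoids installing into a different interpreter by mistake.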
You can also request further customization of your existing Workspace by creating a .txt requirements file listing any packages and versions you need (see https://pip.pypa.io/en/stable/reference/pip_freeze/). Please email the .txt requirements file to support@datarepublic.com to discuss your package requirements.
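A requirements file is a plain-text list with one package==version entry per line, the same format that pip freeze prints. The sketch below shows one way to produce that text from a dictionary of pins; the function names and the example packages and versions are hypothetical and stand in for whatever your project needs.

```python
from pathlib import Path


def format_requirements(pins):
    """Return requirements-file text with one 'name==version' line per package, sorted by name."""
    lines = [f"{name}=={version}" for name, version in sorted(pins.items())]
    return "\n".join(lines) + "\n"


def write_requirements(pins, path="requirements.txt"):
    """Write the formatted pins to a .txt file and return the path that was written."""
    target = Path(path)
    target.write_text(format_requirements(pins), encoding="utf-8")
    return target


# Hypothetical pins for illustration; choose versions compatible with your Workspace's Python.
example_pins = {"requests": "2.22.0", "lightgbm": "2.3.0"}
print(format_requirements(example_pins), end="")
```

If you already have a working environment elsewhere, running pip freeze there and saving its output to a .txt file gives an equivalent starting point.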
Contact Data Republic
Email support@datarepublic.com to discuss your requirements if:
- You require increased processing power above the standard Workspace options.
- You require additional Workspace(s) to allow users to access and analyze datasets at the same time.
- You require different software or versions.
- You have generated a .txt requirements file for Python packages (with versions compatible with the Workspace specifications above) and would like to discuss whether these can be installed in the Workspace.
Get started in 4 steps
STEP 1: Join the project
The Senate Host organization that has invited your organization as a Senate Guest will typically set up the Project and draft the Data License agreement in Senate.
Actions required from you
- Request user logins and assign roles.
- Use Project Conversations to regularly communicate progress and/or ask questions.
Hint: Work closely with the project lead from the Senate Host organization to coordinate activities. Set up a call to introduce your team, plan your project timeline, and assign tasks and responsibilities.
STEP 2: Data Preparation
Anyone bringing data and/or files will need to complete this step.
You can access step by step instructions in our online help articles:
Upload Data, Creating databases and tables, Creating packages for exchange, Add your data package to your project.
STEP 3: Data License approval
Once all data packages have been added to the Project, terms around data use, access, and commercials can be finalized in the Data License. Within the Project’s Data License, your organization and your Senate Host can agree to apply your own legal framework or use Data Republic’s Common Legal Framework for data exchange. Once approved, the Data License is legally binding between all parties.
For more step by step guides on Data Licenses, please see our help articles on Creating and approving a data license and How to write your first data license.
Hint: When editing a license, use the default conversation in your project, or create a new conversation with particular members to communicate directly on license changes.
STEP 4: Request and analyze approved datasets
Upon data license approval, the Data Licensee will need to request a Workspace to analyze approved datasets.
If you are the Data Licensee:
- Request a Discovery Workspace, select software, and add any users who will require access to the Workspace, then check the box to agree to the terms before submitting your request for approval. If the request is approved, Data Republic will load data packages to the Workspace and install the required software. This may take up to 1 business day.
- Workspace users will be notified by email when the Workspace is ready. Users can then connect to the Workspace to access approved datasets for analysis.
- If permitted by the Data License, users can request to download approved outputs from the Workspace.
Closing Workspaces
If you are the Data Licensee:
Once you have created your data product and/or have downloaded approved outputs, you can request to have your Workspace terminated and user access revoked.
Please note: Once a Workspace is terminated, all associated content and data within the Workspace will be deleted and cannot be recovered.
Set your team up for success
Congratulations, your organization has signed the Senate Guest License.
The activities below can be completed concurrently to ensure that you and your team are ready to execute your first data collaboration project.
STEP 1: Define data use case and requirements
Schedule a meeting to understand the data opportunity and value.
- What problem are you trying to solve, and how?
- What datasets are needed / available?
- What outcomes are expected?
- What outputs will be delivered?
- What are the timeline and priorities?
- What are the legal and commercial terms for data use?
Time: 2-3 hours
Attendees: Your organization, Senate Host organization
Seek advice from: Legal, risk, commercial, data governance, and data teams
STEP 2: Internal alignment and understanding
Hold meetings with the sponsor and key stakeholders to align everyone on what you want to achieve.
- Define success
- Be clear on expectations, data, and resource requirements
- Identify concerns to be addressed
- Agree on an escalation point if needed
Time: Half day
Attendees: Your organization
Seek advice from: Legal, risk, commercial, data governance, and data teams
STEP 3: Project kick-off
Host a meeting to onboard your team and explain next steps.
- Define roles and responsibilities for project execution
- Stakeholder engagement matrix and escalation points
- Timeline, milestones and check-in points for project tracking
Time: 2 hours
Attendees: Your organization, Senate Host organization
Project members involved: Project lead, data engineer / analyst, data license approver
STEP 4: Scope technical and analytical requirements for Senate
Refer to Tools for analysis to help you scope your requirements.
Once you have created your list of requirements, contact Data Republic to discuss.
Time: 1 hour
Attendees: Your organization, Data Republic
Project members involved: Project lead, data engineer / analyst, Data Republic support member