File API (Open Data for Industries)

Fetch records or request data file locators.

Applications or users can use the File API to:
  • Request file location data from the Open Data for Industries environment to generate an individual signed URL per file.
  • Upload files to the Open Data for Industries environment by using a signed URL.

The API supports upload and download of files and saving and retrieval of metadata for files in the Open Data for Industries environment.

To use the File API, see the File API reference.

File API usage

To use the File API, follow these steps:

  1. Choose a partition.

    Open Data for Industries fetches file definitions for different tenants, depending on the accounts on the system. A user can belong to several accounts, for example, their own account and a customer account. When you log in to the industry applications, you choose which account is active. To use the File API, specify the active account in the data-partition-id header of each request.

  2. Create data groups.

    Create groups by using the Entitlements API (Open Data for Industries). Data group names use the prefix data. For data access authorization, the following groups must already exist.

    • service.delivery.viewer
    • service.file.editors
    • service.file.viewers
  3. Manage a single file or a collection of files by using signed URLs.

    You can use the File API to download files that are uploaded to the Open Data for Industries storage.

    1. Download a single file by using an object reference.
      1. Get the upload URL.

        This request generates a reference for the file in Cloud Object Storage. The reference is used to bind the metadata to the file. The request also generates a signed URL that you can use to upload the file.

        curl --location --request GET 'https://{{cpd_url}}/osdu-file/api/file/v2/files/uploadURL' \
        --header 'data-partition-id: {{data-partition-id}}' \
        --header 'Content-Type: application/json' \
        --header 'Authorization: Bearer {{access_token}}'
        
      2. Get the generated metadata ID.

        This request generates a reference for the file metadata on the Open Data for Industries document storage. The reference is used to download the file.

        curl --location --request POST 'https://{{cpd_url}}/osdu-file/api/file/v2/files/metadata' \
        --header 'x-ms-blob-type: BlockBlob' \
        --header 'data-partition-id: {{data-partition-id}}' \
        --header 'Authorization: Bearer {{access_token}}' \
        --header 'Content-Type: application/json' \
        --data-raw '{
            "data": {
                "Endian": "BIG",
                "Checksum": "string",
                "DatasetProperties": {
                    "FileSourceInfo": {
                        "FileSource": "{{file_source}}",
                        "Name": "TestCSV_status.csv"
                    }
                },
                "ExtensionProperties": {
                    "Classification": "Raw File",
                    "Description": "Example text that further describes this file.",
                    "ExternalIds": [
                        "string"
                    ],
                    "FileDateCreated": {},
                    "FileDateModified": {},
                    "FileContentsDetails": {
                        "TargetKind": "{{target_kind}}",
                        "FileType": "csv"
                    }
                },
                "EncodingFormatTypeID": "string",
                "Name": "{{file_name}}",
                "SchemaFormatTypeID": "string"
            },
            "meta": [],
            "id": "{{record_id}}",
            "version": 1613026613300181,
            "kind": "opendes:wks:dataset--File.Generic:1.0.0",
            "acl": {
                "viewers": [
                    "data.default.viewers@opendes.ibm.com"
                ],
                "owners": [
                    "data.default.viewers@opendes.ibm.com"
                ]
            },
            "legal": {
                "legaltags": [
                    "{{legal_tag}}"
                ],
                "otherRelevantDataCountries": [
                    "US"
                ],
                "status": "compliant"
            },
            "tags": {
                "dataflowId": "test-dataflowid-proxy"
            }
        }'
        
      3. Get the signed URL for downloading the file.

        This request generates a signed URL that is needed to download the file, which is identified by its metadata ID.

        curl --location --request GET 'https://{{cpd_url}}/osdu-file/api/file/v2/files/{{file_record_meta_id}}/downloadURL' \
        --header 'data-partition-id: {{data-partition-id}}' \
        --header 'Authorization: Bearer {{access_token}}'
        
      4. Use the signed URL to download the file. You do not need to provide any credentials to access the URL.

        Paste the signed URL into any HTTP client, for example, your browser, to download the file.
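
Steps 3 and 4 of this procedure can be sketched as a few shell helpers. This is a minimal sketch rather than product tooling: the function names are hypothetical, and the script deliberately does not parse the JSON response of the downloadURL request, because the response field names are not stated on this page. Copy the signed URL out of the response by hand or with a JSON tool of your choice.

```shell
#!/bin/sh
# Sketch only: the helper names below are hypothetical, not part of the File API.

# Build the downloadURL endpoint for a metadata record ID (step 3).
# $1 = host (the {{cpd_url}} placeholder), $2 = metadata record ID
download_url_endpoint() {
  printf 'https://%s/osdu-file/api/file/v2/files/%s/downloadURL\n' "$1" "$2"
}

# Step 3: request the signed URL. Prints the raw JSON response.
# $1 = host, $2 = data partition ID, $3 = access token, $4 = metadata record ID
request_download_url() {
  curl --fail --silent --show-error --location --request GET \
    "$(download_url_endpoint "$1" "$4")" \
    --header "data-partition-id: $2" \
    --header "Authorization: Bearer $3"
}

# Step 4: fetch the file from the signed URL. No Authorization header is sent,
# because the signature embedded in the URL grants access.
# $1 = signed URL, $2 = local output path
download_with_signed_url() {
  curl --fail --silent --show-error --location --output "$2" "$1"
}
```

A call such as `request_download_url "$CPD_URL" "$PARTITION" "$TOKEN" "$RECORD_ID"` prints the response that contains the signed URL, which you then pass to `download_with_signed_url`.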

    2. Download a single file or a collection of files by using a file name reference.
      1. Get the storage instructions URL.

        This request generates a reference for the file in Cloud Object Storage. The reference is used to bind the metadata to the file.

        The request generates a signed URL that you can use to upload the file.

        The request also generates storage instructions that other client code can use to call the generated endpoint. The POST /files/storageInstructions endpoint is therefore intended for integration with other client systems.

        curl --location --request POST 'https://{{cpd_url}}/osdu-file/api/file/v2/files/storageInstructions' \
        --header 'data-partition-id: {{data-partition-id}}' \
        --header 'Authorization: Bearer {{access_token}}'
        
      2. Get the generated record ID by using the Storage API PUT /records endpoint.

        The request creates a binding between the storage instructions and the Open Data for Industries cloud storage location. The binding associates the file reference with the generated resource ID.

        curl --location --request PUT 'https://{{cpd_url}}/osdu-storage/api/storage/v2/records' \
        --header 'data-partition-id: {{data-partition-id}}' \
        --header 'Authorization: Bearer {{access_token}}' \
        --header 'Content-Type: application/json' \
        --data-raw '[{
        	"kind": "opendes:osdu:dataset-registry:3.0.6",
        	"legal": {
        		"legaltags": [
        			"{{legaltag}}"
        		],
        		"otherRelevantDataCountries": [
        			"{{country}}"
        		],
        		"status": "compliant"
        	},
        	"acl": {
        		"viewers": [
        			"{{viewer_acl}}"
        		],
        		"owners": [
        			"{{owner_acl}}"
        		]
        	},
        	"data": {
        		"ResourceID": "{{resource_id}}",
        		"ResourceTypeID": "srn:type:file/{{file_type}}:",
        		"ResourceSecurityClassification": "srn:reference-data/ResourceSecurityClassification:RESTRICTED:",
        		"ResourceSource": "{{resource_source}}",
        		"ResourceName": "{{resource_name}}",
        		"ResourceDescription": "{{resource_description}}",
        		"DatasetProperties": {
        			"FileSourceInfo": {
        				"FileSource": "{{fileSource}}",
        				"Name": "Hello.txt",
        				"PreLoadFilePath": "s3://oc-cpd-opendes-staging-bucket/{{fileSource}}"
        			}
        		}
        	}
        }]'
        
      3. Get the signed URL to download the file by using the record ID. Use the File API POST /files/retrievalInstructions endpoint.

        The request uses the resource ID to download the associated resources. Because resources are referenced by ID, you do not need to remember their metadata.

        curl --location --request POST 'https://{{cpd_url}}/osdu-file/api/file/v2/files/retrievalInstructions' \
        --header 'data-partition-id: {{data-partition-id}}' \
        --header 'Authorization: Bearer {{access_token}}' \
        --header 'Content-Type: application/json' \
        --data-raw '{
            "datasetRegistryIds": [
                "{{resource_id}}"
            ]
        }'
        
      4. Use the signed URL to download the file. You do not need to provide any credentials to access the URL.

        Paste the signed URL into any HTTP client, for example, your browser, to download the file.
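
For a collection of files, the datasetRegistryIds array in step 3 carries one entry per record ID. The following is a minimal sketch of building that request body and sending it. The function names are hypothetical, and the sketch assumes that the IDs contain no double quotation marks or backslashes, because it performs no JSON escaping.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of the File API.

# Build the retrievalInstructions request body from one or more record IDs.
# Assumes the IDs need no JSON escaping.
retrieval_body() {
  ids=""
  for id in "$@"; do
    if [ -n "$ids" ]; then ids="${ids}, "; fi
    ids="${ids}\"${id}\""
  done
  printf '{"datasetRegistryIds": [%s]}\n' "$ids"
}

# Send the request. Prints the raw JSON response.
# $1 = host, $2 = data partition ID, $3 = access token, remaining = record IDs
request_retrieval_instructions() {
  cpd_url="$1"; partition="$2"; token="$3"
  shift 3
  curl --fail --silent --show-error --location --request POST \
    "https://${cpd_url}/osdu-file/api/file/v2/files/retrievalInstructions" \
    --header "data-partition-id: ${partition}" \
    --header "Authorization: Bearer ${token}" \
    --header 'Content-Type: application/json' \
    --data-raw "$(retrieval_body "$@")"
}
```

For example, `request_retrieval_instructions "$CPD_URL" "$PARTITION" "$TOKEN" "$ID1" "$ID2"` requests download instructions for two records in one call.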

  4. Use the provided API methods to fetch files from the storage layer of Open Data for Industries.
Table 1. File service endpoints
| API endpoint | Description | API reference |
|---|---|---|
| GET /files/{id}/downloadURL | Returns a signed URL for downloading a file, based on its ID. | Get a signed URL for a specific file |
| GET /files/uploadURL | Returns a signed URL for a single file. Use the URL to upload the file. | Get location information for a file |
| POST /delivery/GetFileSignedUrl | Returns signed URLs for all files in the request that enable short-term access to the files. | Get a signed URL for a file |
| GET /files/{id}/metadata | Returns the metadata for a specific file, based on its file ID. | Get metadata for a specific file |
| POST /files/metadata | Saves file metadata based on a file ID. | Create metadata for a file |
| POST /files/storageInstructions | Returns an unsigned and a signed URL for uploading a file. | Get Storage instructions |
| POST /files/retrievalInstructions | Returns an unsigned and a signed URL for downloading the data set file sources. | File retrieval instructions |
| POST /files/copy | Copies the file source identified by the record ID from the staging bucket to the persistent bucket. | File copy |

File API endpoint permissions

Table 2. API method permissions
| Endpoint URL | Method | Minimum permissions required |
|---|---|---|
| /files/{id}/downloadURL | GET | service.file.viewers |
| /files/uploadURL | GET | service.file.editors |
| /delivery/GetFileSignedUrl | POST | service.delivery.viewer |
| /files/{id}/metadata | GET | service.file.viewers |
| /files/metadata | POST | service.file.editors |
| /files/storageInstructions | POST | No permission needed |
| /files/retrievalInstructions | POST | No permission needed |
| /files/copy | POST | No permission needed |
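
Table 2 can also be encoded as a small lookup, for example to report which group a caller needs before a request is sent. The following is a minimal sketch. The function name `required_group` is hypothetical, and the sketch prints `none` for the endpoints that the table lists as needing no permission.

```shell
#!/bin/sh
# Sketch only: required_group is a hypothetical helper that mirrors Table 2.
# $1 = HTTP method, $2 = endpoint path relative to the File API base
required_group() {
  case "$1 $2" in
    "GET /files/"*"/downloadURL")      echo "service.file.viewers" ;;
    "GET /files/uploadURL")            echo "service.file.editors" ;;
    "POST /delivery/GetFileSignedUrl") echo "service.delivery.viewer" ;;
    "GET /files/"*"/metadata")         echo "service.file.viewers" ;;
    "POST /files/metadata")            echo "service.file.editors" ;;
    "POST /files/storageInstructions"|"POST /files/retrievalInstructions"|"POST /files/copy")
                                       echo "none" ;;
    *)                                 echo "unknown"; return 1 ;;
  esac
}
```

The function returns a nonzero status for any method and path combination that does not appear in the table, so a calling script can stop before it sends a request to an endpoint it does not recognize.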