MongoDB Atlas
The MongoDB Atlas origin reads data from MongoDB Atlas or MongoDB Enterprise Server. For information about supported versions, see Supported Systems and Versions in the Data Collector documentation.
The MongoDB Atlas origin reads from capped and uncapped collections. When you configure the origin, you define connection information, such as the connection string and credentials to use. You can specify SSL/TLS properties for an SSL/TLS-enabled MongoDB cluster. You can also use a connection to configure the origin.
You configure the database, collection, offset details, and read preference. You can define a custom filter and configure the origin to flatten nested structures.
You can optionally configure advanced options that determine how the origin connects to MongoDB, such as the maximum number of open connections to allow in the connection pool and the cursor type to use for capped collections.
When the pipeline stops, the origin notes where it stops reading. When the pipeline starts again, the origin continues processing from the last-saved offset by default. You can reset the origin to process all available data.
The origin can generate events for an event stream. For more information about dataflow triggers and the event framework, see Dataflow Triggers Overview.
Credentials
Based on the authentication used by MongoDB, configure the MongoDB Atlas origin to use no authentication, username/password authentication, or LDAP authentication. By default, no authentication is used.
- Authentication method - Specify the authentication to use with the Authentication Method property on the Credentials tab:
  - None
  - Username / Password
  - LDAP
- Connection string - If you prefer, you can specify credentials in the connection string on the Connection tab. However, specifying credentials on the Credentials tab is the recommended method.
Offset Field and Initial Offset
MongoDB uses the offset field to track the data to read. By default, the MongoDB Atlas origin uses the _id field as the offset field.
You can use any field type as the offset field. The origin determines the type of the field based on the first record in the collection. If you do not use the default _id field, results are not guaranteed.
When you use an Object ID field, specify the initial offset in one of the following formats:
- Hexadecimal string - Use a hexadecimal string representation of the Object ID, such as 62193d7cf7e3300b6646bdc8. This is available when viewing collection documents using MongoDB Compass.
- Datetime - Use the following datetime format: YYYY-MM-DD HH:mm:ss
When you use a string field, specify the initial string to use as the initial offset.
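The two Object ID offset formats are related, because a MongoDB Object ID stores its creation time, in seconds since the Unix epoch, in its first four bytes. The following minimal sketch shows that relationship in Python. The function names are hypothetical, the use of UTC is an assumption, and the sketch does not claim to reproduce how the origin converts a datetime offset internally.

```python
from datetime import datetime, timezone


def object_id_to_datetime(hex_id: str) -> datetime:
    """Return the UTC creation time embedded in a 24-character Object ID string."""
    if len(hex_id) != 24:
        raise ValueError("an Object ID string has 24 hexadecimal characters")
    # The first 4 bytes (8 hex characters) hold seconds since the Unix epoch.
    seconds = int(hex_id[:8], 16)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def datetime_to_object_id_hex(moment: datetime) -> str:
    """Return the lowest Object ID string that carries the timestamp of `moment`."""
    seconds = int(moment.timestamp())
    # Zero-fill the remaining 8 bytes so the value sorts at or before any
    # Object ID generated in the same second.
    return f"{seconds:08x}" + "0" * 16


created = object_id_to_datetime("62193d7cf7e3300b6646bdc8")
print(created.strftime("%Y-%m-%d %H:%M:%S"))  # 2022-02-25 20:35:08 (UTC)
print(datetime_to_object_id_hex(created))     # 62193d7c0000000000000000
```

A helper like this can be useful for checking which point in time a hexadecimal initial offset corresponds to before starting the pipeline.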
Specifying Field Paths
- Data Collector format - Uses a slash ( / ) as a delimiter. Includes a leading slash.
- MongoDB format - Uses a period ( . ) as a delimiter.
Data Collector Format | MongoDB Format |
---|---|
/_id | _id |
/orders/address/line1 | orders.address.line1 |
/orders/lines[1]/quantity | orders.lines[1].quantity |
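The mapping between the two formats in the table is mechanical, so it can be sketched as a pair of small Python helpers. The function names are hypothetical, and the sketch assumes field names that contain neither slashes nor periods.

```python
def to_mongodb_path(sdc_path: str) -> str:
    """Convert a Data Collector field path to MongoDB dot notation."""
    if not sdc_path.startswith("/"):
        raise ValueError("a Data Collector field path starts with a slash")
    # Drop the leading slash, then swap the delimiter.
    return sdc_path[1:].replace("/", ".")


def to_data_collector_path(mongodb_path: str) -> str:
    """Convert MongoDB dot notation to a Data Collector field path."""
    return "/" + mongodb_path.replace(".", "/")


for path in ("/_id", "/orders/address/line1", "/orders/lines[1]/quantity"):
    print(path, "->", to_mongodb_path(path))
```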
Read Preference
You can configure the read preference that the MongoDB Atlas origin uses. The read preference determines how the origin reads data from different members of the MongoDB replica set.
- Primary - Requires reading from the primary member.
- Primary Preferred - Prefers reading from the primary, but allows reads from a secondary member.
- Secondary - Requires reading from a secondary member.
- Secondary Preferred - Prefers reading from a secondary, but allows reads from a primary when necessary.
- Nearest - Reads from the member with the least network latency.
By default, the origin uses Secondary Preferred to avoid making unnecessary requests to the primary member.
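The five options correspond to the standard MongoDB read preference modes, which MongoDB drivers also accept as the readPreference connection string option. The following sketch lists that correspondence in Python. The function name is hypothetical, and whether the origin honors this option when it appears in the connection string is an assumption that this section does not confirm.

```python
# Origin property label -> MongoDB read preference mode name.
READ_PREFERENCE_MODES = {
    "Primary": "primary",
    "Primary Preferred": "primaryPreferred",
    "Secondary": "secondary",
    "Secondary Preferred": "secondaryPreferred",
    "Nearest": "nearest",
}


def read_preference_option(label: str = "Secondary Preferred") -> str:
    """Return the connection string option for a read preference label."""
    return "readPreference=" + READ_PREFERENCE_MODES[label]


print(read_preference_option())           # readPreference=secondaryPreferred
print(read_preference_option("Nearest"))  # readPreference=nearest
```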
Custom Filter
You can specify a custom filter to reduce the data that MongoDB passes to the origin to process. Use a custom filter to return a subset of all available data before processing. Use the MongoDB query operator syntax for filters when you define a custom filter.
For example, to read only the documents where the city field is San Francisco, you can use the following custom filter:
{ city: "San Francisco" }
For more information, including the appropriate syntax for query operators, see the MongoDB Atlas documentation.
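To illustrate what an equality filter selects, the following Python sketch applies a filter to a few in-memory documents. The sample documents and the matches_equality helper are hypothetical. The helper emulates only top-level equality on scalar values, not MongoDB query operators or array matching, and it is not how MongoDB evaluates filters.

```python
import json

# Filter that selects documents whose city field equals "San Francisco".
custom_filter = {"city": "San Francisco"}


def matches_equality(document: dict, query: dict) -> bool:
    """Return True when every field in the query equals the document's value."""
    return all(document.get(field) == value for field, value in query.items())


# Hypothetical sample documents.
documents = [
    {"_id": 1, "city": "San Francisco"},
    {"_id": 2, "city": "Oakland"},
    {"_id": 3, "city": "San Francisco"},
]

selected = [doc["_id"] for doc in documents if matches_equality(doc, custom_filter)]
print(json.dumps(custom_filter))  # {"city": "San Francisco"}
print(selected)                   # [1, 3]
```

Because the filter is evaluated by MongoDB, only the matching documents are passed to the origin.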
Event Generation
The MongoDB Atlas origin can generate events when it completes processing all available data and the configured batch wait time has elapsed.
You can use the events in any logical way. For example:
- With the Pipeline Finisher executor to stop the pipeline and transition the pipeline to a Finished state when the origin completes processing available data.
When you restart a pipeline stopped by the Pipeline Finisher executor, the origin continues processing from the last-saved offset unless you reset the origin.
For an example, see Stopping a Pipeline After Processing All Available Data.
- With a destination to store event information.
For an example, see Preserving an Audit Trail of Events.
For more information about dataflow triggers and the event framework, see Dataflow Triggers Overview.
Event Records
Record Header Attribute | Description |
---|---|
sdc.event.type | Event type. Uses the following event type: no-more-data |
sdc.event.version | Integer that indicates the version of the event record type. |
sdc.event.creation_timestamp | Epoch timestamp when the stage created the event. |
- no-more-data - The MongoDB Atlas origin generates a no-more-data event record differently depending on the collection and cursor type:
  - When the collection type is capped and the origin uses a tailable cursor type, the origin generates the event record after it processes all available records and the number of seconds specified for Max Batch Wait Time elapses without any new data appearing.
  - When the collection type is not capped or when the origin uses a different cursor type for capped collections, the origin generates the event record immediately after reading all available data.
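The two cases can be summarized as a small decision function. The following Python sketch models the behavior described above. The function and parameter names are hypothetical, and the sketch is not the origin's implementation.

```python
def no_more_data_due(capped: bool, tailable_cursor: bool,
                     idle_seconds: float, max_batch_wait_seconds: float) -> bool:
    """Decide whether the event is due once all available records are read.

    idle_seconds is the time elapsed without any new data appearing.
    """
    if capped and tailable_cursor:
        # A tailable cursor on a capped collection waits for more data,
        # so the event is due only after Max Batch Wait Time elapses.
        return idle_seconds >= max_batch_wait_seconds
    # Uncapped collection, or a different cursor type: due immediately.
    return True


print(no_more_data_due(True, True, idle_seconds=2, max_batch_wait_seconds=5))   # False
print(no_more_data_due(True, True, idle_seconds=5, max_batch_wait_seconds=5))   # True
print(no_more_data_due(False, False, idle_seconds=0, max_batch_wait_seconds=5)) # True
```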
Enabling SSL/TLS
- Atlas/System CA - Connects to a MongoDB Atlas cluster. You can also use this when your certificates or keys have already been specified at the JVM level.
- Server Validation (1 Way TLS) - Connects to an SSL/TLS-enabled MongoDB Enterprise Server cluster when the client needs to validate the server certificate and does not need to prove client identity.
- Server and Client Validation (2 Way TLS) - Connects to an SSL/TLS-enabled MongoDB Enterprise Server cluster when the client needs to validate the server certificate and the server also validates the client key. This occurs when the cluster is set up to require client certificates.
You can provide certificates and keys in the following formats:
- JKS (Java Keystore)
- PEM (text-based)
- DER (text-based)
- PKCS #7 / P7B
- PKCS #12 / P12 / PFX
- Private keys inside PEM, DER, or PKCS #12 encoded as PKCS#1 or PKCS#8
If the files are in PEM or DER plain text format, you can provide the text in the stage properties. The certificate should begin and end with text such as: -----BEGIN CERTIFICATE----- or -----END PRIVATE KEY-----.
Otherwise, you provide a path to the certificate file.
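Before pasting text into a stage property, it can help to confirm that the text carries matching begin and end markers. The following Python sketch performs that check. The function names are hypothetical, the sample body is a placeholder rather than a real certificate, and the check validates only the markers, not the encoded content.

```python
import re

# A PEM block: a BEGIN marker, an encoded body, and an END marker with the same label.
PEM_BLOCK = re.compile(
    r"-----BEGIN ([A-Z0-9 ]+)-----\s+(.+?)\s+-----END \1-----",
    re.DOTALL,
)


def pem_labels(text: str) -> list:
    """Return the labels of all well-delimited PEM blocks found in the text."""
    return [match.group(1) for match in PEM_BLOCK.finditer(text)]


def looks_like_pem(text: str) -> bool:
    """Return True when the text contains at least one delimited PEM block."""
    return bool(pem_labels(text))


sample = "-----BEGIN CERTIFICATE-----\nMIIB\n-----END CERTIFICATE-----\n"
print(pem_labels(sample))                    # ['CERTIFICATE']
print(looks_like_pem("/path/to/cert.pem"))   # False
```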
MongoDB Data Types
When the MongoDB Atlas origin reads from MongoDB, it converts standard MongoDB data types to the following Data Collector data types.
The origin can also convert supported BSON types to Data Collector data types. For more information, see Reading BSON Types.
Standard MongoDB Type | Data Collector Type |
---|---|
Array | List |
Binary | Byte Array |
Boolean | Boolean |
Date | Date |
Double | Double |
Int32 | Integer |
Int64 | Long |
JavaScript | String |
Object | List-Map |
String | String |
Timestamp | Datetime |
Reading BSON Types
When reading from MongoDB, the MongoDB Atlas origin converts standard MongoDB data types to Data Collector data types as described in MongoDB Data Types.
The origin converts supported BSON data types to Data Collector data types as well. When converting BSON data types, the origin adds a field attribute named bsonType to the converted field.
Some supported BSON data types encode additional information with the data. Where this occurs, the information is included as additional attributes for the field. For example, a BsonTimestamp can encode an ordinal value along with the date and time. When the origin reads the data, it converts the field to a Datetime field with an ordinal field attribute set to the ordinal value encoded with the data.
BSON Data Type | Data Collector Type | Field Attributes and Values |
---|---|---|
Binary | Byte Array | bsonType : Binary |
BsonDbPointer | Map field with the following subfields: | bsonType : Bson_Db_Pointer |
BsonRegularExpression | String | |
BsonTimestamp | Datetime | |
Code | String | bsonType : Code |
CodeWithScope | String | bsonType : Code_With_Scope |
DBRef | Map field with the following subfields: | bsonType : Db_Ref |
Decimal128 | Decimal | bsonType : Decimal128 |
Null | String with null value | bsonType : Null |
ObjectId | String containing the 24-character hexadecimal value of the Object Id | |
Symbol | String | bsonType : Symbol |
Undefined | String with null value | bsonType : Undefined |
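As an illustration of a BSON type that encodes additional information, a BSON timestamp is a 64-bit value whose upper 32 bits hold seconds since the Unix epoch and whose lower 32 bits hold an ordinal. The following Python sketch splits such a value into a datetime and an ordinal attribute. The function name and the returned dictionary layout are hypothetical, the ordinal is kept as an integer for simplicity, and the sketch does not reproduce the exact record that the origin creates.

```python
from datetime import datetime, timezone


def split_bson_timestamp(raw: int) -> dict:
    """Split a 64-bit BSON timestamp into a datetime value and an ordinal."""
    seconds = raw >> 32          # upper 32 bits: seconds since the Unix epoch
    ordinal = raw & 0xFFFFFFFF   # lower 32 bits: ordinal within that second
    return {
        "value": datetime.fromtimestamp(seconds, tz=timezone.utc),
        "attributes": {"ordinal": ordinal},
    }


# Hypothetical raw value: second 1645821308 with ordinal 7.
field = split_bson_timestamp((1645821308 << 32) | 7)
print(field["value"].isoformat())     # 2022-02-25T20:35:08+00:00
print(field["attributes"]["ordinal"]) # 7
```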
Configuring a MongoDB Atlas Origin
Configure a MongoDB Atlas origin to read data from MongoDB Atlas or MongoDB Enterprise Server.