Using Excel Data Integration

The configuration file is located for Excel data integration in the gene-model-dom module; its default name is "excel-mapping.yml". As a result, the default configuration file can be found using this path:

gene-model/gene-model-dom/excel-mapping/excel-mapping.yml

Note that this default configuration file is generated at the same time as the Java classes that implement the model, that is, on the first build of the application.

You can use a distinct file name or folder by using the property gene.excel.mapping in the application.yml of the Data Service extension.

The default configuration file contains documentation on how to implement your customized mappings.

An imported Excel file must have a precise structure in terms of sheet and column names. However, the names of the sheets and the columns in each sheet can be configured using the configuration file.

The Excel file processor will first use the configuration file. If there is no configuration for a specific entity or entity field, the Excel file processor will use the default behavior.

A different behavior can be configured in the configuration file for some of the data types.

For instance, a date format to import dates formatted as strings can be specified in the configuration file, globally or per column. Date formats are based on a simple sequence of letters and symbols. For example, "d MMM yyyy" will format 2011-12-03 as "3 Dec 2011". Here are more examples:

yyyy-MM-dd                  2009-12-31
dd-MM-YYYY                  31-12-2009
yyyy-MM-dd HH:mm:ss         2009-12-31 23:59:59
HH:mm:ss.SSS                23:59.59.999
yyyy-MM-dd HH:mm:ss.SSS     2009-12-31 23:59:59.999
yyyy-MM-dd HH:mm:ss.SSS Z   2009-12-31 23:59:59.999 +0100

The following pattern letters are defined:

Symbol  Meaning                     Presentation      Examples
  ------  -------                     ------------      -------
G       era                         text              AD; Anno Domini; A
u       year                        year              2004; 04
y       year-of-era                 year              2004; 04
D       day-of-year                 number            189
M/L     month-of-year               number/text       7; 07; Jul; July; J
d       day-of-month                number            10

Q/q     quarter-of-year             number/text       3; 03; Q3; 3rd quarter
Y       week-based-year             year              1996; 96
w       week-of-week-based-year     number            27
W       week-of-month               number            4
E       day-of-week                 text              Tue; Tuesday; T
e/c     localized day-of-week       number/text       2; 02; Tue; Tuesday; T
F       week-of-month               number            3

a       am-pm-of-day                text              PM
h       clock-hour-of-am-pm (1-12)  number            12
K       hour-of-am-pm (0-11)        number            0
k       clock-hour-of-am-pm (1-24)  number            0

H       hour-of-day (0-23)          number            0
m       minute-of-hour              number            30
s       second-of-minute            number            55
S       fraction-of-second          fraction          978
A       milli-of-day                number            1234
n       nano-of-second              number            987654321
N       nano-of-day                 number            1234000000

V       time-zone ID                zone-id           America/Los_Angeles; Z; -08:30
z       time-zone name              zone-name         Pacific Standard Time; PST
O       localized zone-offset       offset-O          GMT+8; GMT+08:00; UTC-08:00;
X       zone-offset 'Z' for zero    offset-X          Z; -08; -0830; -08:30; -083015; -08:30:15;
x       zone-offset                 offset-x          +0000; -08; -0830; -08:30; -083015; -08:30:15;
Z       zone-offset                 offset-Z          +0000; -0800; -08:00;
[       optional section start
]       optional section end

In the configuration file, you can also choose between a simple duration format, such as [hh]:mm:ss and the ISO-8601 format.

To configure the duration format, you will use the following syntax in the configuration file:

duration:
  format: "DURATION"

or

duration:
  format: "ISO_8601"

Finally, in the generated Excel file, you can specify to auto-size columns by setting the property services.data.excel.autosize-columns to true in the application.yml of the Data Service extension. Note that auto-sizing columns is very expensive in generation time, so you should use it with caution.