Design Considerations
Terminologies:
We need to understand the XML keywords that will be used across this article
1. XML Data file: Data file with .xml extension with all the data that should be loaded into the Tables.
2. XSD: Its schema definition file. This file defines the structure which XML Data file should have. Also it defines the range of values for data fields in XML Data file, against which the data is validated.
Normally in PLSQL interface the validations on the data are performed using PLSQL code. In XML interface the XSD file can be used to perform following validations on XML Data file
1. Verifies the structure of the XML File
2. Checks if mandatory values are missing
3. Checks if the values for a particular field are within range of valid values.
3. XML schema URL: This is the value of parameter in XML Data file header definition which indicates the name of XSD that should be used for validating the XML Data file.
Interface design considerations:
1. Structure for the XML data file should be decided and accordingly XSD file should be developed. XSD file will define the exact structure of XML file and range of valid values for various data fields in the XML.
2. Location of the XSD and XML File to be processed needs to be decided. In current case the XSD and XML Data file resides in predefined directories in UNIX Box.
3. Usually XSD file is received and its registration process is done at the start. New XSD file registration is required only if there is modification in structure of XML data or range of values for any field in XML file. Incase a new XSD file is available then the Old XSD record in database is end dated and new XSD file record is made active.
4. Attributes in XML File such as ‘xsdversion’ can be used to identify the XSD version to be used for validating the XML File.
5. Archiving for XML and XSD files can be carried out by moving them to backup directories once processing is complete.
6. Frequency at which the XML Data file will have to be processed will depend on Business requirements and can be monthly or daily.
7. XML and XSD are stored in database as LOB objects before any processing for historical records.
8. Data Extraction in done by using Oracle XML DB packages available with release 10g of Oracle database.
9. The next article explains the process for loading the XML data into one single Table. Using some modifications the data can be loaded into Multiple or Parent-Child tables. Single XML Data file can contain the Header as well as Line record.
Process Diagram:
Design
XSD Registration Process:
XSD is schema definition file and its needs to be registered in Oracle database. This is one time process and will be required to be re-executed only if the XSD is modified. Following are the steps for XSD Registration
1. XSD file is available in the UNIX directory. This file is read using UTL_FILE and inserted into the table as LOB object using DBMS_LOB package.
---Insert a Empty CLOB record with XSD details
INSERT INTO testxml_sources
( source_id
,source_name
,xml_schema_name
,xml_schema
,start_date
,end_date
)
VALUES (l_source_id
,p_source_name
,p_schema_url
,EMPTY_CLOB()
,SYSDATE
,NULL
)
RETURNING
--Return the LOB locator
xml_schema INTO l_xml_schema_definition;
---Create a BFILE Pointer for the XSD file residing in <UNIX DIR>
l_src_file := BFILENAME(<UNIX DIR>, <XSD_FILE_NAME>);
--Open the file
DBMS_LOB.FILEOPEN(l_src_file, DBMS_LOB.FILE_READONLY);
-- Read the file into the l_xml_schema_definition from l_src_file
DBMS_LOB.LOADFROMFILE ( dest_lob => l_xml_schema_definition
, src_lob => l_src_file
, amount => DBMS_LOB.LOBMAXSIZE
);
DBMS_LOB.CLOSE (l_src_file );
2. If a new XSD File comes in because of any changes in XSD definition, the record for the old XSD in the table is end dated and a new record is inserted into the table with Latest XSD file. This makes sure that Old XSD records retrieval is possible from database for any purpose.
UPDATE testxml_sources
SET end_date = TRUNC(SYSDATE)
WHERE source_name = p_source_name
AND xml_schema_name = p_schema_url
AND end_date IS NULL
AND SYSDATE BETWEEN start_date
AND NVL(end_date,SYSDATE+1);
3. Now the XSD in database is registered. Incase any Old XSD schema definition is present its deleted and new XSD is registered. DBMS_XMLSCHEMA function used for registration creates a XSD record in standard table
--Get the XSD from Database
SELECT xml_schema
,xml_schema_name
INTO l_schema
,l_schema_name
FROM testxml_sources
WHERE source_name = p_source_name
AND SYSDATE BETWEEN start_date AND NVL(end_date,SYSDATE+1);
--Delete Old definition
DBMS_XMLSCHEMA.DELETESCHEMA(l_schema_name,
DBMS_XMLSCHEMA.DELETE_CASCADE);
--Register New Definition
DBMS_XMLSCHEMA.REGISTERSCHEMA(schemaURL =>l_schema_name
,schemaDoc => l_schema);
XML Data file Processing:
XML file holds the data that is to be loaded and validated. XML file contains the name of XSD that will be by default used for validation of the XML file. However in the current process explained here we have forced the XSD to be used for XSD validation. This process is three stages
1. Importing the File in Database:
XML Data file resides in the UNIX BOX in predefined path. This file is first imported
as CLOB object into the XML Data file storage table.
---Insert a Empty CLOB record with XML File details
INSERT INTO testxml_data_files
(source_id
,file_id
,xml_file
,load_date
,file_name
)
VALUES
(l_source_id
,l_file_id
,EMPTY_CLOB()
,p_xml_file_name
);
RETURNING
--Return the LOB locator
xml_file INTO l_xml_file;
---Create a BFILE Pointer for the XML file residing in <UNIX DIR>
l_src_file := BFILENAME(<UNIX DIR>,<XML FILE>);
--Open the file
DBMS_LOB.FILEOPEN(l_src_file, DBMS_LOB.FILE_READONLY);
-- Read the file into the l_xml_file from l_src_file
DBMS_LOB.LOADFROMFILE ( dest_lob => l_xml_file
, src_lob => l_src_file
, amount => DBMS_LOB.LOBMAXSIZE
);
DBMS_LOB.CLOSE (l_src_file );
2. Validate the XML Data file:
The XML Data file is validated against the XSD file using xmltype.schemavalidate
function. This function by default will use the XSD specified inside the XML for
validation, but here we have forced it to use the XSD passed. Incase of any
validation failure of XML against the XSD, it will throw an Exception.
--Declare following variables
l_xml_data testxml_data_files.xml_file%TYPE;
l_xml_schema_name testxml_sources.xml_schema_name%TYPE;
l_xml_type_0 XMLTYPE;
l_xml_type XMLTYPE;
--Fetch the Xml file details from the table
SELECT sdf.xml_file
,sxs.xml_schema_name
INTO l_xml_data
,l_xml_schema_name
FROM testxml_data_files sdf
,testxml_sources sxs
where sxs.source_id = sdf.source_id
AND sxs.xsd_version = sdf.xsd_version
and sxs.source_name = p_source_name
and sdf.file_id = p_file_id
AND SYSDATE BETWEEN sxs.start_date
AND NVL(sxs.end_date,SYSDATE+1);
--Cast the CLOB into XML Type data type
l_xml_type_0 := XMLTYPE.createxml(l_xml_data);
--Specify the Name of XSD to be used for validation
l_xml_type := l_xml_type_0.createschemabasedxml(l_xml_schema_name);
l_xml_type.setschemavalidated(1);
--Validate the XML Data file
xmltype.schemavalidate(l_xml_type);
3. Load the XML Data into the staging Table.
The data contained in the XML File once validated is loaded into the Staging table
using standard XML Functions. The PLSQL code used here is specific to the
structure of XML file.
--Declare variables
l_xml_data testxml_data_files.xml_file%TYPE;
l_xml_schema_name testxml_sources.xml_schema_name%TYPE;
l_xml_type XMLTYPE;
--Fetch the Data file record in CLOB object
SELECT sdf.xml_file
,sxs.xml_schema_name
,sxs.source_id
INTO l_xml_data
,l_xml_schema_name
,l_source_id
FROM testxml_data_files sdf
,testxml_sources sxs
WHERE sxs.source_id = sdf.source_id
AND sxs.source_name = p_source_name
AND sdf.file_id = p_file_id
AND SYSDATE BETWEEN sxs.start_date
AND NVL(sxs.end_date,SYSDATE+1);
--Cast CLOB data into an XML type
l_xml_type :=XMLTYPE.CREATEXML(l_xml_data);
First FOR Loop is opened for the Parameters which are attributes of the File and
not the actual data of the file. These attributes of the XML File can be used to take
decisions related to processing of the file.
E.g. the attribute xsdversion defines the version of XSD to be used for validating
the XML File
FOR l_invoice_rec IN
( SELECT EXTRACTVALUE(VALUE(XMLD), '/Invoice /@title') title
, EXTRACTVALUE(VALUE(XMLD), '/Invoice/@xsdversion') xsdversion
, EXTRACTVALUE(VALUE(XMLD), '/Invoice/@purpose') purpose , EXTRACTVALUE(VALUE(XMLD), '/Invoice/@create_date') create_date
, EXTRACTVALUE(VALUE(XMLD), '/Invoice/@create_time') create_time
, EXTRACTVALUE(VALUE(XMLD), '/Invoice/@receiver_name') receiver_name
, ROWNUM rn
FROM
TABLE(XMLSEQUENCE(l_xml_type.EXTRACT(‘Invoice '))) XMLD)
LOOP
--Open loop of processing data for Invoice Header
FOR l_header_rec IN
(SELECT
EXTRACTVALUE(VALUE(XMLD), '/header/transaction_date') transaction_date
, EXTRACTVALUE(VALUE(XMLD), '/header/Invoicenum’) Invoicenum
, EXTRACTVALUE(VALUE(XMLD), '/header/Customer ') Customer,
, EXTRACTVALUE(VALUE(XMLD), '/header/Shipto ') Shipto,
, EXTRACTVALUE(VALUE(XMLD), '/header/TotAmount ') TotAmount,
, ROWNUM rn
FROM
TABLE(XMLSEQUENCE(l_xml_type.EXTRACT('Invoice/header'))) XMLD)
LOOP
--Open the Loop for Invoice Line data
FOR l_line_rec IN
( SELECT
EXTRACTVALUE(VALUE(XMLD), '/line /item') item,
,EXTRACTVALUE(VALUE(XMLD), '/line/price') price,
,EXTRACT(VALUE(XMLD),'/transaction/amount') amount
,ROWNUM rn
FROM
TABLE(XMLSEQUENCE(l_xml_type.EXTRACT(‘Invoice/Header/line’))) XMLD)
LOOP
INSERT INTO INOVICE_LINE_STG
(line_id
,transaction_date
,invoicenum
,customer
,shipto
,item
,price
,amount)
VALUES
(line_id.nextval
, l_header_rec.transaction_date
, l_header_rec.Invoicenum
,l_header_rec.Customer
,l_header_rec.Shipto
,l_line_rec.item
,l_line_rec.price
,l_line_rec.amount);
END LOOP;
END LOOP;
END LOOP;
Though in above step the Invoice Header and Line data is loaded into the same table, with use of some PLSQL Code this data can be loaded into two different tables based on Business requirement.
No comments:
Post a Comment