Introduction
This article provides a set of sample code snippets for loading data from Oracle Data as a Service for Customer Intelligence (DaaS for CI) into an Oracle Business Intelligence Cloud Service (BICS) Schema Service Database.
“Oracle Data as a Service for Customer Intelligence (DaaS for CI) enables you to analyze your own unstructured text as well as public social data, so you can translate insights into actions that benefit you and your customers. DaaS for CI provides a comprehensive feed of social media and news data from over 40 million sites worldwide, with new sites being added every day. These sites include social networks, blogs, video sharing sites, forums, news, and review sites.” See Using Oracle Data as a Service for Customer Intelligence for more information on DaaS for CI.
The code snippets utilize the DaaS for CI Enriched Social Data Feed API to retrieve the data. The DaaS for CI data is then inserted into a BICS schema service database table using PL/SQL.
“The Enriched Social Data Feed API provides access to public social data from millions of users that has been gathered, cleaned, and analyzed by Oracle. Based on topic requirements that you define, you use the API to retrieve files containing data that is ready for your use in your Business Intelligence systems. Using the Enriched Social Data Feed alongside existing BI tools such as Oracle Business Intelligence Enterprise Edition, for example, you can visualize in detail the reaction of consumers to an advertising campaign, and much more. The files that are created and pre-analyzed, and that you retrieve using the Enriched Social Data Feed API, are often referred to as your data feed.” See “Create Your Enriched Social Data Feed API Requests” for more information on the API.
Diagram 1 illustrates the Enriched Social Data Feed API returning a list of message URLs/paths. This list is commonly referred to as the “data feed”. Each link is a reference to a .gz file. Contained inside each .gz file is an .xml file holding the data for that individual message. Diagram 2 shows an example of the contents of the message .xml file. This article walks through the process of looping through each data feed .gz file and parsing the data contained in the .xml message file into a relational database table.
Diagram 1: Data Feed
Diagram 2: Contents of Data Feed Message .xml file.
The intended audience for this article is BICS developers. That said, the code snippets provided can be run in any Oracle database configured with APEX (independently of BICS). With appropriate code adjustments, the PL/SQL can be converted to run outside of APEX on an Oracle on-premises or DBaaS schema (e.g. using the UTL_HTTP package). Additionally, the core concepts covered can be used as a starting point for DaaS for CI integrations with non-Oracle relational databases, and the general themes are also beneficial for those wanting to integrate DaaS for CI with other products using alternative programming languages.
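As an illustration of that conversion, below is a minimal sketch of making the same authenticated REST call with UTL_HTTP instead of the APEX API. The wallet path, feed URL, and credentials are placeholders, and a network ACL plus an Oracle wallet (required for HTTPS) are assumed to be configured:
DECLARE
  l_request  UTL_HTTP.REQ;
  l_response UTL_HTTP.RESP;
  l_buffer   VARCHAR2(32767);
  l_clob     CLOB;
BEGIN
  -- Placeholder wallet location; required for HTTPS endpoints.
  UTL_HTTP.SET_WALLET('file:/path/to/wallet', NULL);
  l_request := UTL_HTTP.BEGIN_REQUEST(
    'https://data.collectiveintellect.com/feeds/{Feed_ID}/index.xml?max_age=259200', 'GET');
  -- Basic authentication, using the same API Key / Customer XID placeholders as Step Four.
  UTL_HTTP.SET_AUTHENTICATION(l_request, 'API Key', 'Customer XID');
  l_response := UTL_HTTP.GET_RESPONSE(l_request);
  BEGIN
    LOOP
      UTL_HTTP.READ_TEXT(l_response, l_buffer, 32767);
      l_clob := l_clob || l_buffer;
    END LOOP;
  EXCEPTION
    WHEN UTL_HTTP.END_OF_BODY THEN
      UTL_HTTP.END_RESPONSE(l_response);
  END;
  -- l_clob now holds the index.xml feed for further parsing.
END;
For the binary .gz message files, UTL_HTTP.READ_RAW and a BLOB would be used in place of READ_TEXT and the CLOB.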
Main Article
Step One – Review Documentation
Click here for the DaaS for CI documentation home page.
This article focuses on the Enriched Social Data Feed API.
The Enriched Social Data Feed API documentation can be accessed from the home page by clicking on “Create Your Enriched Social Data Feed API Requests”.
Step Two – Create Table
Run the code snippet below to create the SOCIAL_DATA_FEEDS table that will be used to store the DaaS for CI data in BICS.
Click here for a text version of the create table code snippet.
CREATE TABLE SOCIAL_DATA_FEEDS
(
  "FNAME"                VARCHAR2(1000 BYTE),
  "PUB_DATE"             VARCHAR2(30 BYTE),
  "DESCRIPTION"          VARCHAR2(40 BYTE),
  "RELEASE"              VARCHAR2(80 BYTE),
  "COUNT"                NUMBER,
  "MSG_ID"               NUMBER,
  "MSG_PUBLISHED_ON"     DATE,
  "MSG_SOURCE"           VARCHAR2(90 BYTE),
  "MSG_SOURCE_TYPE"      VARCHAR2(40 BYTE),
  "MSG_LINK"             VARCHAR2(120 BYTE),
  "MSG_TITLE"            VARCHAR2(200 BYTE),
  "MSG_SOURCE_ID"        VARCHAR2(200 BYTE),
  "MSG_AUTH_NAME"        VARCHAR2(100 BYTE),
  "MSG_AUTH_FOLLOWERS"   NUMBER,
  "MSG_AUTH_FRIENDS"     NUMBER,
  "MSG_AUTH_KLOUT"       NUMBER,
  "MSG_AUTH_SRC_ID"      VARCHAR2(200 BYTE),
  "TOPIC_ID"             NUMBER,
  "TOPIC_NAME"           VARCHAR2(50 BYTE),
  "SNIPPET_ID"           NUMBER,
  "SNIPPET_READABILITY"  VARCHAR2(40 BYTE),
  "SNIPPET_SUBJECTIVITY" VARCHAR2(40 BYTE),
  "SNIPPET_TONALITY"     NUMBER,
  "SNIPPET_ANCHOR"       VARCHAR2(40 BYTE),
  "SNIPPET_TEXT"         VARCHAR2(1000 BYTE),
  "DIMENSION_ID"         NUMBER,
  "DIMENSION_NAME"       VARCHAR2(60 BYTE)
);
Step Three – Insert Records
This step provides the “insert” code snippet that loads the DaaS for CI data into BICS.
The high-level steps of the insert stored procedure are:
a) Loop through each URL/path in the data feed.
-> The DaaS for CI data feed is retrieved using the Enriched Social Data Feed API.
-> Each .gz file is read in as a BLOB using the Oracle APEX APEX_WEB_SERVICE.MAKE_REST_REQUEST_B function.
b) Open/uncompress each .gz file to retrieve the .xml message file.
-> The Oracle UTL_COMPRESS PL/SQL package (LZ_UNCOMPRESS function) is used to uncompress each .gz file.
c) Map the XPath for each XML field.
-> The Oracle XMLTable SQL function is used to parse each field out of the XML.
d) Data is inserted into a relational database table.
-> All fields are inserted into the SOCIAL_DATA_FEEDS table.
Click here for a text version of the insert records code snippet.
create or replace procedure SP_UNCOMPRESS_AND_INSERT(p_username varchar2, p_password varchar2, p_url varchar2)
IS
  v_blob            BLOB;
  v_uncompress_blob BLOB;
BEGIN
  -- Retrieve the .gz message file as a BLOB.
  v_blob := apex_web_service.make_rest_request_b
  (
    p_url         => p_url,
    p_http_method => 'GET',
    p_username    => p_username,
    p_password    => p_password
  );
  -- Uncompress the .gz file to expose the .xml message file.
  v_uncompress_blob := utl_compress.lz_uncompress(v_blob);
  INSERT INTO SOCIAL_DATA_FEEDS
  SELECT
    p_url,
    m.pub_date,
    m.description,
    m.release,
    m.count,
    p.msg_id,
    TO_DATE(SUBSTR(p.msg_published_on, 0, 17), 'YYYY-MM-DD HH24:MI:SS') msg_published_on,
    p.msg_source,
    p.msg_source_type,
    p.msg_link,
    p.msg_title,
    --p.msg_body,
    p.msg_source_id,
    p.msg_auth_name,
    p.msg_auth_followers,
    p.msg_auth_friends,
    p.msg_auth_klout,
    p.msg_auth_src_id,
    t.topic_id,
    t.topic_name,
    s.snippet_id,
    s.snippet_readability,
    s.snippet_subjectivity,
    s.snippet_tonality,
    s.snippet_anchor,
    s.snippet_text,
    d.dimension_id,
    d.dimension_name
  FROM
    -- Outer-joined XMLTables walk the message hierarchy: messages -> posts -> topics -> snippets -> dimensions.
    XMLTable(xmlnamespaces('http://www.collectiveintellect.com/schemas/messages' as "ci"), '/ci:messages'
      passing xmltype(v_uncompress_blob, nls_charset_id('AL32UTF8'))
      columns
        pub_date    varchar2(30) path 'pub_date',
        description varchar2(40) path 'description',
        release     varchar2(80) path 'release',
        count       number       path 'count',
        post        xmltype      path 'posts/post'
    ) (+) m,
    XMLTable('/post' passing m.post
      columns
        msg_id             number        path 'message_id',
        msg_published_on   varchar2(50)  path 'published_on',
        msg_source         varchar2(90)  path 'source',
        msg_source_type    varchar2(40)  path 'source_type',
        msg_link           varchar2(120) path 'link',
        msg_title          varchar2(200) path 'title',
        --msg_body         varchar2(1000) path 'body',
        msg_source_id      varchar2(200) path 'message_source_generated_id',
        msg_auth_name      varchar2(100) path 'author/name',
        msg_auth_followers number        path 'author/followers_count',
        msg_auth_friends   number        path 'author/friends_count',
        msg_auth_klout     number        path 'author/klout_score',
        msg_auth_src_id    varchar2(200) path 'author/source_generated_id',
        topic              xmltype       path 'topics/topic'
    ) (+) p,
    XMLTable('/topic' passing p.topic
      columns
        topic_id   number       path '@id',
        topic_name varchar2(50) path '@name',
        snippet    xmltype      path 'snippets/snippet'
    ) (+) t,
    XMLTable('/snippet' passing t.snippet
      columns
        snippet_id           number         path 'id',
        snippet_readability  varchar2(40)   path 'readability',
        snippet_subjectivity varchar2(40)   path 'subjectivity',
        snippet_tonality     number         path 'tonality',
        snippet_anchor       varchar2(40)   path 'anchor',
        snippet_text         varchar2(1000) path 'text',
        dim                  xmltype        path 'dimensions/dimension'
    ) (+) s,
    XMLTable('/dimension' passing s.dim
      columns
        dimension_id   number       path 'id',
        dimension_name varchar2(60) path 'name'
    ) (+) d;
  commit;
END;
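To unit test the procedure, it can be invoked directly with a single .gz URL taken from the data feed. The credentials and URL below are placeholders:
BEGIN
  SP_UNCOMPRESS_AND_INSERT(
    p_username => 'API Key',       -- placeholder
    p_password => 'Customer XID',  -- placeholder
    p_url      => 'https://data.collectiveintellect.com/feeds/{Feed_ID}/{file}.gz'  -- placeholder .gz URL from the data feed
  );
END;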
Step Four – Get Data Feed Link
Replace the placeholder values below with those from your environment.
a) Max Age = the age, in seconds, of the oldest file to be returned. The value can be up to 604800 (one week). For example, 259200 (3 x 24 x 60 x 60) returns files from the last three days.
b) URL = the data feed link to index.xml.
c) Username = API Key
d) Password = Customer XID
This code snippet reads through the Data Feed list of files and calls the SP_UNCOMPRESS_AND_INSERT stored procedure for each one.
Click here for a text version of the get data feed link code snippet.
CREATE OR REPLACE PROCEDURE RUN_CI_INSERT
IS
  l_ws_response_clob CLOB;
  l_ws_response_xml  XMLTYPE;
  l_max_age          VARCHAR2(100) := '259200'; -- last 3 days, in seconds
  l_ws_url           VARCHAR2(500) := 'https://{CustomerKey:ApiKey}:@data.collectiveintellect.com/feeds/{Feed_ID}/index.xml?max_age=' || l_max_age;
  l_username         VARCHAR2(100) := 'API Key';      -- placeholder
  l_password         VARCHAR2(100) := 'Customer XID'; -- placeholder
BEGIN
  DELETE FROM SOCIAL_DATA_FEEDS;
  -- Retrieve the data feed index (the list of .gz file URLs).
  l_ws_response_clob := apex_web_service.make_rest_request
  (
    p_url         => l_ws_url,
    p_http_method => 'GET',
    p_username    => l_username,
    p_password    => l_password
  );
  l_ws_response_xml := XMLTYPE.createXML(l_ws_response_clob);
  -- Loop through each URL in the data feed and load its messages.
  FOR R IN
  (
    select feed_link
    from xmltable
    (
      '/batches/batch/url'
      passing l_ws_response_xml
      columns
        feed_link VARCHAR2(500) path 'text()'
    )
  )
  LOOP
    SP_UNCOMPRESS_AND_INSERT(l_username, l_password, R.feed_link);
  END LOOP;
END;
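Before scheduling the load in Step Five, the procedure can be run interactively to verify connectivity and parsing:
BEGIN
  RUN_CI_INSERT;
END;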
Step Five – Run Job
Click here for a text version of the run job code snippets.
a) Create the job.
BEGIN
  cloud_scheduler.create_job(
    job_name        => 'LOAD_CI_DATA',
    job_type        => 'STORED_PROCEDURE',
    job_action      => 'RUN_CI_INSERT',
    start_date      => '01-MAR-16 07.00.00.000000 AM -05:00',
    repeat_interval => 'FREQ=DAILY',
    enabled         => TRUE,
    comments        => 'Loads CI Data into SOCIAL_DATA_FEEDS');
END;
b) Run the job.
BEGIN
  CLOUD_SCHEDULER.RUN_JOB(JOB_NAME => 'LOAD_CI_DATA');
END;
c) Audit the job.
Check the progress of currently running jobs:
SELECT * FROM USER_SCHEDULER_RUNNING_JOBS
WHERE JOB_NAME = 'LOAD_CI_DATA';
Display log information about job runs, job state changes, and job failures:
SELECT * FROM USER_SCHEDULER_JOB_LOG
WHERE JOB_NAME = 'LOAD_CI_DATA';
Display detailed information about job runs, job state changes, and job failures:
SELECT * FROM USER_SCHEDULER_JOB_RUN_DETAILS
WHERE JOB_NAME = 'LOAD_CI_DATA';
Display information about scheduled jobs:
SELECT * FROM USER_SCHEDULER_JOBS
WHERE JOB_NAME = 'LOAD_CI_DATA';
d) Disable / re-enable the job.
BEGIN
  CLOUD_SCHEDULER.DISABLE('LOAD_CI_DATA');
END;
BEGIN
  CLOUD_SCHEDULER.ENABLE('LOAD_CI_DATA');
END;
Step Six – Review Data
SELECT * FROM SOCIAL_DATA_FEEDS;
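Beyond a simple SELECT *, a few aggregate queries can help validate the load. These are examples only, using the columns created in Step Two:
-- Message volume by source type
SELECT MSG_SOURCE_TYPE, COUNT(DISTINCT MSG_ID) AS MESSAGES
FROM SOCIAL_DATA_FEEDS
GROUP BY MSG_SOURCE_TYPE
ORDER BY MESSAGES DESC;
-- Average snippet tonality per topic
SELECT TOPIC_NAME, AVG(SNIPPET_TONALITY) AS AVG_TONALITY
FROM SOCIAL_DATA_FEEDS
GROUP BY TOPIC_NAME;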
Further Reading
Click here for more A-Team BICS Blogs.
Click here for the DaaS for CI documentation home page.
Click here for the Enriched Social Data Feed API documentation.
Click here for the Application Express API Reference Guide – MAKE_REST_REQUEST_B Function.
Summary
This article provided a set of sample code snippets that leverage the APEX_WEB_SERVICE API to integrate Oracle Data as a Service for Customer Intelligence (DaaS for CI) with an Oracle Business Intelligence Cloud Service (BICS) Schema Service Database.
The DaaS for CI Enriched Social Data Feed API was used to retrieve the data feed. The results from the data feed were then loaded into a BICS schema service database using PL/SQL and APEX functions.
The code snippets provided can be run in any Oracle database configured with APEX (independently of BICS). With appropriate code adjustments, the PL/SQL can be converted to run outside of APEX on an Oracle on-premises or DBaaS schema (e.g. using the UTL_HTTP package).
Themes and concepts covered in the article are also beneficial for those wanting to integrate DaaS for CI with other products using alternative programming languages.