2015-05-26

JSON for Relational Data in DB2 for z/OS?!

Last year I wrote an article on JSON and its impact on DB2 DBAs for the IDUG Content Library (see http://www.idug.org/p/do/sd/sid=6779&type=0). In that article I advocated the introduction of JSON views on relational data.

In my article I referred to options with JSONx, but that technology appears to be outdated, and new functions have appeared since its publication. A developerWorks article by Jane Man and Jae Lee (http://www.ibm.com/developerworks/data/library/techarticle/dm-1501sql-json-db2/index.html) describes the JSON SQL interfaces for DB2 11.

These new functions will allow you to create a poor man's JSON replication of your relational data. I call it poor man's because it will obviously not give you real-time projections of your relational data, but it will allow incremental refreshes of JSON documents using SQL INSERT, UPDATE and DELETE statements, as in the sketch below.
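To make this concrete, here is a minimal sketch of such an incremental refresh in Java. All table and column names (JSON_CUSTOMER, CUSTNO, DATA) are hypothetical, and I assume the SYSTOOLS.JSON2BSON conversion function described in the developerWorks article is available; treat this as an illustration, not a finished implementation.

import java.sql.Connection;
import java.sql.PreparedStatement;

// Sketch of a "poor man's" incremental refresh: whenever a CUSTOMER row
// changes, rebuild its JSON shadow document. Table and column names are
// hypothetical; SYSTOOLS.JSON2BSON is the JSON-to-BSON conversion function
// described in the developerWorks article.
public class JsonShadowRefresh {
    public static void refresh(Connection con, int custNo, String name, String city)
            throws Exception {
        // Build the JSON projection in the application (naive quoting;
        // a real implementation would use a JSON library).
        String doc = "{\"custNo\": " + custNo
                   + ", \"name\": \"" + name + "\""
                   + ", \"city\": \"" + city + "\"}";

        String upd = "UPDATE JSON_CUSTOMER SET DATA = SYSTOOLS.JSON2BSON(?) WHERE CUSTNO = ?";
        try (PreparedStatement ps = con.prepareStatement(upd)) {
            ps.setString(1, doc);
            ps.setInt(2, custNo);
            if (ps.executeUpdate() > 0) {
                return;                       // existing document refreshed
            }
        }
        // No document yet for this key: insert a new one.
        String ins = "INSERT INTO JSON_CUSTOMER (CUSTNO, DATA) VALUES (?, SYSTOOLS.JSON2BSON(?))";
        try (PreparedStatement ps = con.prepareStatement(ins)) {
            ps.setInt(1, custNo);
            ps.setString(2, doc);
            ps.executeUpdate();
        }
    }
}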

As JSON is a simple hierarchical structure, it can easily be mapped to the more complex relational structures on the one hand, and to the object structure of an object-oriented language on the other. I saw in this an opportunity to get rid of Object Relational Mapping software, which often generates resource-hungry SQL.

Instead of leaving the data mapping to the application programmer, the data architect could identify the data structure in the database and provide the mapping. By data architect, I mean a person with knowledge of the business, the functional processes and the data structure. This role could build a relational model and map it adequately to a hierarchical model.

As a developer cannot be expected to know relational theory in depth, the data architect could apply his specialization to the database he knows best. For us this would be DB2 for z/OS.

In a way I am pleading for the implementation of a few JSON features in DB2:
  • First of all, the notion of JSON schemas should appear. MongoDB has a methodology to enforce mandatory structures upon JSON data (you cannot control the non-mandatory parts of a JSON document as you can in XML); this is what is known as JSON schemas. In DB2 this has not been available so far.
    These schemas could allow one to create JSON projections of relational data in DB2 for z/OS, as you can with XML projections in SQL Server (https://technet.microsoft.com/en-us/library/aa258637%28v=sql.80%29.aspx). JSON schemas should only be applied to those JSON documents that relate to relational data, in order to guarantee performance (a sketch of such a schema-constrained projection follows after this list).
  • Next, these JSON projections should be accompanied by concurrency checks; they shouldn't provide unlimited update/insert/delete options, as the underlying data remains relational. As such, the JSON projections would only hold versions of a record if they relate to temporal tables or tables with a history table involved.
  • Finally, the JSON schemas should be immediately discoverable by any JDBC driver compatible with JSON, so that building an object model mapping is no longer required, as a usable hierarchical structure is already available from within the database.
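Since DB2 offers no such schema support today, the following is only an application-side sketch of the idea: a hypothetical JSON schema for a customer projection, and a document built to conform to it using the Jackson library. All field and class names are invented for the example.

import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ObjectNode;

// Sketch only: a JSON-schema-constrained projection of a relational row.
// DB2 cannot enforce the schema, so it is merely the documentation the
// data architect would publish alongside the projection.
public class CustomerProjection {

    // Hypothetical JSON schema: custNo and name are mandatory, city is not.
    static final String SCHEMA =
          "{ \"type\": \"object\","
        + "  \"required\": [\"custNo\", \"name\"],"
        + "  \"properties\": {"
        + "    \"custNo\": { \"type\": \"integer\" },"
        + "    \"name\":   { \"type\": \"string\" },"
        + "    \"city\":   { \"type\": \"string\" } } }";

    // Build a document that conforms to the schema above.
    public static String project(int custNo, String name, String city) {
        ObjectMapper mapper = new ObjectMapper();
        ObjectNode doc = mapper.createObjectNode();
        doc.put("custNo", custNo);
        doc.put("name", name);
        if (city != null) {          // optional field, may be omitted
            doc.put("city", city);
        }
        return doc.toString();
    }
}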
The translation of relational data to JSON would also allow imports of this data into Big Data systems like Hadoop or other JSON stores without the need for direct connectivity. From this perspective DB2, and especially DB2 for z/OS, could play the role of a central data repository that pushes enterprise data to applications running in the outer domains of the secure network, without the need to open the mainframe to internet access and risk attacks on your most valuable information.

If IBM were to consider these changes, the DBA community could continue to leverage its knowledge and extend it to application development, even in the cloud. This vision would fit the spirit of DevOps, where the DBA becomes one of the project team members working closely with the business and functional analysts.

Besides DB2, there are other interfaces that you could use to bring relational data and JSON together. Have a look at BigSQL, stinger.next, ... There is a lot of interesting material, but I'll keep that for another blog :-)

Best regards,

Ludovic

2015-02-02

The Usefulness of Java/.NET Impact Analysis on DB2 for z/OS

 

Introduction

Over the past few months I have been reflecting on the following subject: "how can we keep track of the origin of Java and .NET queries on DB2 for z/OS?" I discussed the subject with several people and came to the conclusion that the answer was somewhat different than expected.

First of all, I had to reconsider the question itself, even though the theme is raised in the same way in many mainframe enterprises across the globe.

We, as mainframe DBAs, are used to checking the impact of a database change on a PL/I, C or Cobol program. All three languages typically make use of static SQL and create packages, for which the impact analysis is fairly simple because the dependencies are embedded in the package implementation. For dynamic SQL, we traditionally kept the amount of it restricted, so that we had full control over it. Until recently, very little dynamic SQL occurred outside controlled reporting tools or REXX programs, for which we could easily read the objects being used from the source code.
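For the static SQL case, that dependency information sits in the catalog. As a reminder, a minimal sketch of the classic lookup in Java, using the SYSIBM.SYSPACKDEP catalog table (the surrounding class and method names are mine):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

// Classic static SQL impact analysis: which packages depend on a table?
// SYSIBM.SYSPACKDEP records package dependencies; BTYPE 'T' marks a
// dependency on a table.
public class StaticImpact {
    public static void listDependentPackages(Connection con, String schema, String table)
            throws Exception {
        String sql = "SELECT DCOLLID, DNAME FROM SYSIBM.SYSPACKDEP "
                   + "WHERE BQUALIFIER = ? AND BNAME = ? AND BTYPE = 'T'";
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setString(1, schema);
            ps.setString(2, table);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    // Collection and package name of each dependent package.
                    System.out.println(rs.getString("DCOLLID") + "." + rs.getString("DNAME"));
                }
            }
        }
    }
}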

This changed drastically over the past 10 years. Nowadays, the IBM Universal JDBC driver and the data server driver allow direct connections to DB2 for z/OS from almost any application that is granted authority on the database. Java and .NET programs are now the major source of dynamic SQL. This dynamic SQL comes in huge amounts and, in many enterprises, is not controlled at all.

 

Application Domains

As application DBA, one has to consider the impact of a database change on an application. As our traditional tools do not give us the information for dynamic SQL that we had for static SQL, we had to rely on the experience of the application developer. Hence the question arose whether we couldn't identify the programs (classes/packages) that make use of these database objects, as we did for static SQL.

The question was put to both Java and .NET developers, and they all gave clues on how to identify database objects in their source code, but none of the solutions proved to be concise, and all were tied to the technologies being used to call the database.

Knowing this, we had to change our focus. We no longer had to search for a way to retrieve the objects from the source code. Instead, we had to focus on identifying the origin of a query starting from DB2, and collect all queries coming from that source to define the domain of a given application.

This revealed the magic word 'DOMAIN'. Although Domain Driven Design is a de facto standard on many platforms, DB2 for z/OS has mostly been left untouched by it, as the mainframe was considered the 'enterprise domain' and in many places holds a single schema, or only a few very large ones, reflecting technical domains rather than application domains. With the advent of large quantities of dynamic SQL on DB2 for z/OS, this classic schema setup should be reconsidered.

The schema setup should depend on how DB2 for z/OS is addressed. If DB2 for z/OS is considered a data server feeding an Enterprise Data Bus (using stored procedures), the classic setup may continue to exist, as the outside API is controlled by native objects.

However, when JDBC/ODBC applications are allowed to select data directly from DB2 for z/OS, the organization of DB2 should be adapted to the respective application domains, e.g. by implementing a schema per application domain and providing views where cross-domain reads are required. That way, all objects used by an application are within a given DB2 schema, as they are all part of that application domain. Unsurprisingly, this kind of setup simplifies the impact analysis for static SQL too, and allows a uniform database approach whatever the underlying database.
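A minimal sketch of what such a setup could look like, with invented domain names (a CRM domain owns the customer table, and a CLAIMS domain reads it only through a view in its own schema):

import java.sql.Connection;
import java.sql.Statement;

// Sketch of the schema-per-domain setup. All schema, table, column and
// authorization ID names are hypothetical.
public class DomainSetup {
    public static void createCrossDomainView(Connection con) throws Exception {
        try (Statement st = con.createStatement()) {
            // The CLAIMS domain sees customer data only through this view.
            st.executeUpdate(
                "CREATE VIEW CLAIMS.V_CUSTOMER (CUSTNO, NAME, CITY) AS " +
                "SELECT CUSTNO, NAME, CITY FROM CRM.CUSTOMER");
            // Grant access to the view only, never to the CRM base table,
            // so every object the application touches stays in its schema.
            st.executeUpdate("GRANT SELECT ON CLAIMS.V_CUSTOMER TO CLAIMSAPP");
        }
    }
}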

But what about the exact packages/classes impacted by a change? How can we tell the development team that they need to adapt one process or another? For now, we can only tell them that their application is impacted somewhere. This leads us to the second aspect of this story: SQL monitoring.

 

SQL Monitoring and Object Analysis

For dynamic SQL, monitoring is vital. As coding becomes more and more complex, one can no longer keep up with the technical changes linked to the wide variety of coding used to address ODBC or JDBC. As DB2 for z/OS DBAs we should be able to focus (mainly, not exclusively!) on the database software and not on the development procedures. Hence we should capture the dynamic SQL statements from the dynamic statement cache and store them according to a few principles.

In order to capture dynamic SQL, IFCIDs 318 and 316 should be activated, along with SMF compression (zparm SMFCOMP). This enables dynamic statement cache explain and dynamic SQL monitoring. Alternatively, one could make use of an SQL monitor such as IBM Query Monitor or CA Detector, but these tend to consume more CPU than trace-based monitoring.

One way to capture dynamic statements is to explain the statement cache regularly. This could work, but it ignores the possibility that your statement cache was flushed between the explain intervals, thus denying you access to part of the statements being issued.
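A minimal sketch of such a periodic snapshot in Java. It assumes IFCID 318 is active (otherwise the statistics fields stay empty), that the DSN_STATEMENT_CACHE_TABLE explain table exists under the current SQLID, and that you hold the required privileges; the SQLHIST history table is an invented name.

import java.sql.Connection;
import java.sql.Statement;

// Periodic snapshot of the dynamic statement cache. Run it at a short
// enough interval that flushed statements are not missed entirely.
public class CacheSnapshot {
    public static void snapshot(Connection con) throws Exception {
        try (Statement st = con.createStatement()) {
            // Externalize the whole cache to DSN_STATEMENT_CACHE_TABLE.
            st.execute("EXPLAIN STMTCACHE ALL");

            // Copy the statements into our own history table (hypothetical
            // name SQLHIST) so they survive the next snapshot and a cache flush.
            st.executeUpdate(
                "INSERT INTO SQLHIST (STMT_ID, CACHED_TS, STMT_TEXT) " +
                "SELECT STMT_ID, CACHED_TS, STMT_TEXT " +
                "FROM DSN_STATEMENT_CACHE_TABLE");
        }
    }
}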

Another way to capture dynamic statements is to make use of monitoring tools with historical monitoring, such as the SQL end-to-end monitoring feature of IBM's Omegamon for DB2 Performance Expert, which copies near-term history data to a DB2 LUW repository in which you can keep this volatile performance data for a longer period. Similar options exist from other vendors, but I have not had the chance to explore them yet.

Once the statements are stored, the SQL syntax can be analyzed to extract the database objects being used. Along with the client information (to be set in the data source or the connection; e.g. for JDBC see the client* properties), this allows you to build an impact analysis tool, as sketched below. It gives you a complete view of the impact of your changes, at a granularity controlled by the developers themselves.
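A sketch of the two building blocks in Java. The property names honored by setClientInfo vary per driver version (check the IBM Data Server Driver documentation for the exact client* properties), and the regular expression is a deliberately naive stand-in for a real SQL parser.

import java.sql.Connection;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Two building blocks for the impact analysis tool.
public class ImpactAnalysis {

    // 1. Developers tag their connections, so each captured statement
    //    can be traced back to an application component.
    public static void tagConnection(Connection con) throws Exception {
        con.setClientInfo("ApplicationName", "claims-batch");  // hypothetical values
        con.setClientInfo("ClientUser", "claims-team");
    }

    // 2. Naive object extraction: scan the statement text for names that
    //    follow FROM/JOIN/UPDATE/INSERT INTO/DELETE FROM. A real tool
    //    would use a proper SQL parser instead.
    private static final Pattern OBJ = Pattern.compile(
        "\\b(?:FROM|JOIN|UPDATE|INSERT\\s+INTO|DELETE\\s+FROM)\\s+"
        + "([A-Z0-9_#@$]+(?:\\.[A-Z0-9_#@$]+)?)",
        Pattern.CASE_INSENSITIVE);

    public static Set<String> extractObjects(String stmtText) {
        Set<String> objects = new LinkedHashSet<>();
        Matcher m = OBJ.matcher(stmtText);
        while (m.find()) {
            objects.add(m.group(1).toUpperCase());
        }
        return objects;
    }
}

Joining the extracted objects with the connection tags then yields, per database object, the list of application components that touch it.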

Conclusion

Forget about 'Database Impact Analysis for DB2 for z/OS .NET/Java programs' and welcome the idea of 'Database Impact Analysis for dynamic SQL'. This blog described a concept that can be used to create a database object usage overview for all dynamic SQL, not only that issued by Java and .NET programs.

The issues related to the maintenance of dynamic SQL are clearly not restricted to the queries themselves, but also relate to the structure and the usage of the database itself. As DB2 for z/OS DBAs we should learn the principles of Service-Oriented Architecture and Domain Driven Design, as we will increasingly be confronted with the reasons why these theories emerged and why they are still relevant after several decades.

'If you can't beat them, join them', you might think, but there is nothing to beat. Like us, developers try to deliver high-quality end products, and we can only help them if we understand what they are doing, and if they understand what we are doing and, especially, why it is important.

Getting insight into how database objects are used is clearly an advantage in this task, and its value has been proven over the past 30 years of DB2 for z/OS.