Wednesday, June 25, 2008

SOA & data management: Understanding the data service layer

From Steve Karlovitz:

SOAs and data management: Understanding the data service layer

The Case for Enterprise Data Services in SOA

Seen on ebizQ: this fundamental article from Oracle's Jeff Pollock, The Case for Enterprise Data Services in SOA.

Data Management for SOA

Seen on EDS blogs: Data Management for SOA.

They wrote somewhere: '... Jill Dyche asserts that "SOA Starts with Data". She advocates creating data services: data hubs exposed as services that manage and provide access to master data. Starting with data services has an appeal to IT organizations that feel the need to adopt SOA ...'

This sounds like music to my ears.

I also like the conclusion: "Data management for SOA should be approached as requiring an enterprise logical data model, mechanisms for federation and sharing of data among relatively autonomous service units, and a data management plan that defines responsibilities, flows, master data stores, latency of updates, synchronization strategies and accountability for data integrity and protection. This plan must align with the organizational responsibilities of service units and their data needs, and it must ultimately support an integrated representation of the state of the enterprise: history, current state and future plans."

See also Jean-Jacques Dubray's reaction on InfoQ: Enterprise Data Management, the 3rd face of the SOA/BPM coin?

Enterprise Data Management on Wikipedia.

The Enterprise Data Management Council Web site.

Tuesday, June 24, 2008

Transactions on the Web

Interesting interview with Mark Little (JBoss).

Entity Framework v2 transparent design

The Entity Framework team will tell you what they are thinking about, and you can even give your feedback ==> http://blogs.msdn.com/efdesign/default.aspx

Some ideas for V2:

  • Persistence Ignorance : We are looking at ways to introduce a full POCO solution for state management and interaction with the ObjectContext.

  • N-Tier Support : Today we support Data Contract serialization of entities or exposing entities via Astoria, in V2 we would like to expand to a DataSet like experience where one can remote graphs and changes to the graphs across the wire using standard WCF services.

  • Code-First : We want to enable a convention based, code only experience with EF wherein one may start with classes and opt-in to database generation and deployment. We expect that we would provide attributes and external mapping capabilities for people who wanted something beyond the convention based mapping.

  • TDD Scenarios: With the introduction of POCO classes some of the TDD scenarios get a lot easier, and we are looking at incorporating some other asks to better fill out the scenario, such as making our ObjectQuery<T> and other generated members of our context and classes virtual.

  • FK's : Today we support bi-directional relationships, and we are looking at introducing model concepts to facilitate the definition of FK like experiences in the model or in one's POCO classes.

  • Lazy Loading: Today we support explicit lazy loading (calling .Load), and we are looking at various options around LoadOptions as well as outright implicit lazy loading.

  • Query Tree Re-Writing: This allows framework developers to contextually, vertically and horizontally filter query results.

  • ...
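The persistence-ignorance and lazy-loading ideas above are easy to illustrate outside of .Net. Here is a minimal Python sketch (all names hypothetical, not EF code) of a plain class whose related collection loads itself on first access:

```python
class LazyCollection:
    """Defers loading of a related collection until it is first touched."""

    def __init__(self, loader):
        self._loader = loader   # callable that actually hits the data store
        self._items = None      # nothing loaded yet

    def _ensure_loaded(self):
        if self._items is None:
            self._items = self._loader()

    def __iter__(self):
        self._ensure_loaded()
        return iter(self._items)

    def __len__(self):
        self._ensure_loaded()
        return len(self._items)


class Customer:
    """Plain object (POCO-style): no persistence base class required."""

    def __init__(self, name, order_loader):
        self.name = name
        # The explicit .Load() call becomes implicit: the collection
        # loads itself on first enumeration.
        self.orders = LazyCollection(order_loader)


c = Customer("ACME", lambda: ["order-1", "order-2"])   # no store access yet
print(len(c.orders))                                    # prints 2, triggers the load
```

This proxy trick is essentially what ORMs generate behind the scenes when lazy loading is implicit rather than an explicit Load call.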


See also Danny Simmons' blog, the ADO.Net blog and the advisory council.

RAM is the new disk!

Interesting post seen on InfoQ, also relayed on Nati Shalom's blog (Gigaspaces).

This raises some comments:

  • No question: there is a need for in-memory databases.

    • RAM and network evolutions are changing the database space. And maybe the impact of network evolution is even more important than the RAM.

      • RAM disks have existed in operating systems for a long time.

      • USB keys are already a kind of disk built on memory chips.

      • Keeping information alive once the box is stopped remains important. In the end, the disk could be used mostly for archiving rather than for primary storage.



    • Most current database technologies are cluttered with disk access APIs; this also includes db4o, HSQLDB and the like.

      • That said, contrary to what Nati said, some advanced database technologies (Oracle and Versant, for instance) are able to bypass the OS's stream-oriented disk APIs and manage the disk space directly.

      • Having caches in database engines will not solve the problem; this reminds me of the first white papers from TimesTen, 12 years ago.





  • New technologies won't replace existing ones, they complement them.

    • 15 years ago, some were predicting the death of mainframes... they are still predominant.

    • Disk technologies can still be improved, see Cameron Purdy's comment for instance.

    • Disks have seen the most impressive progression of all computer components over the last 20 years.



  • In-memory data grids (IMDG) won't eliminate the need for ORM (and Universal mapping, when extended to non-relational data stores and non-object consumers).

    • They just put it in a different place, in an intermediate box.




This all leads us to the notion of a Data Services Platform, which includes a cache but is not limited to one. The Data Access Layer will become even more important than the database itself, which will become the storage layer.
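As a concrete illustration of that layering, here is a minimal sketch (Python, all names hypothetical) of a data service that fronts the storage layer with an in-memory cache, so RAM serves the hot reads while the disk-backed store remains the system of record:

```python
class DataService:
    """Minimal data service layer: a RAM cache in front of a backing store."""

    def __init__(self, store):
        self._store = store   # any dict-like storage layer (disk, DBMS, ...)
        self._cache = {}      # in-memory layer serving hot reads

    def get(self, key):
        if key not in self._cache:        # cache miss: go down to storage
            self._cache[key] = self._store[key]
        return self._cache[key]

    def put(self, key, value):
        self._cache[key] = value          # write-through: keep both layers
        self._store[key] = value          # in sync


backing = {"customer:1": "ACME"}          # stand-in for the storage layer
svc = DataService(backing)
print(svc.get("customer:1"))              # first read hits the store
svc.put("customer:2", "Initech")
print(svc.get("customer:2"))              # served straight from RAM
```

A real Data Services Platform adds much more on top (mapping, federation, query), but the inversion is the same: consumers talk to the service layer, and the database sits behind it as storage.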

Monday, June 23, 2008

Why old DB optimizers cannot deal with the cloud

First article of a series about query optimizers: http://www.databasecolumn.com/2008/06/designing-systems-for-the-grid.html. I hope the follow-up articles will give us much more information! That optimization problem is really interesting: we know it is an NP-complete kind of problem, and it gets even more interesting when dealing with multiple data sources. That's one of the challenges of a modern Data Services Platform.
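To see why this problem bites, consider join ordering, the classic hard part of query optimization: the number of candidate orders grows factorially with the number of tables or sources. A toy brute-force enumeration (Python, with a deliberately naive cost model and made-up table sizes):

```python
from itertools import permutations
from math import factorial

# Hypothetical table sizes spread across several data sources:
tables = {"orders": 1000, "products": 500, "customers": 100, "regions": 10}

def plan_cost(order):
    """Toy cost model: sum the sizes of the intermediate join results,
    assuming a flat 1% selectivity for every join (an assumption)."""
    cost, size = 0, tables[order[0]]
    for name in order[1:]:
        size = max(1, size * tables[name] // 100)  # intermediate result size
        cost += size
    return cost

plans = list(permutations(tables))        # every possible join order
best = min(plans, key=plan_cost)
print(len(plans), "candidate orders for", len(tables), "tables")
print("cheapest order found:", best)
```

With 4 tables there are only 24 orders; with 10 there are already 3,628,800, which is why real optimizers prune the space with dynamic programming and heuristics rather than enumerating, and why federating many sources makes the job so much harder.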

EDM tooling

Seen on the ADO.Net blog: this series of articles about the tooling for EDM. I especially like the fact that the tools are available as wizards within Visual Studio, but also as command-line scripts and APIs. That's pretty cool, convenient and complete.

Wednesday, June 18, 2008

Perst 3.0

In the renewal of the ODBMS market, it seems embedded applications are one of the best niches. McObject announces Perst version 3.0, with support for both Java and .Net (including the LINQ query language).

http://www.infoq.com/news/2008/06/persist-v3

Monday, June 16, 2008

Gemstone, Smalltalk & Ruby

Interesting blog from Avi Bryant about Gemstone, Smalltalk and Ruby (Project MagLev).

For sure, dynamic languages are the best solution for persistence. It seems Smalltalk still rocks. It is interesting to see Ruby and Smalltalk coexisting in the same VM. I really would like to see something like that for Groovy.

See also the related InfoQ article, with links to DabbleDB (yet another online database on the Web) and Seaside project (Web application framework in Smalltalk).

Interview with Avi Bryant about MagLev.

Video of the talk at QCon 2008 London.

Interview with Bob Walker about MagLev.

Movie of a DabbleDB demo (with some nice features like smart date input and filtering, and dynamic schema evolution).
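That dynamic schema evolution is exactly the kind of thing dynamic languages make cheap. A tiny Python sketch (hypothetical data) of records gaining a new field at runtime:

```python
# Records as open dictionaries: the "schema" is just the union of the
# fields in use, so it can evolve while the application runs.
records = [
    {"name": "ACME", "city": "Paris"},
    {"name": "Initech", "city": "Austin"},
]

# Evolve the schema on the fly: add a field, defaulting it for old records.
for r in records:
    r.setdefault("rating", None)

records.append({"name": "Globex", "city": "Springfield", "rating": 5})

# A filter over the evolved schema:
rated = [r["name"] for r in records if r["rating"] is not None]
print(rated)   # prints ['Globex']
```

No migration script, no downtime: old records simply get a default value, which is the experience the DabbleDB demo shows through its UI.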

Testing data oriented applications

In this paper, IBM introduces the challenge of testing applications manipulating data. The problem is even more complex when multiple heterogeneous data sources are involved. Optim is the IBM solution for data testing.

NB: you must first register before downloading this 20-page white paper.

Bill Gates about mapping

Even Bill Gates himself now mentions Data Access and Mapping technologies in his keynote session at TechEd 2008. Actually, at around 59:38, David Campbell starts speaking about the Entity Framework and the Entity Data Model.

I think that's a significant achievement for all of us who have been working on data access, mapping, persistence, etc. for so many years. SQL Server Data Services is also mentioned, just before.

Entity Framework Mapping Resource

Found this new blog from Ju-Yi Kou, providing interesting resources and examples about the mapping features of the Entity Framework.

See also: Mapping 101.

Wednesday, June 11, 2008

New databases in the cloud

Nothing really new in this post from the Database Column.

As a summary, if you don't want to dig into it:

...

Recent DBMS innovations make this a reality today, and the best cloud DBMS architectures will include:

  1. Shared-nothing, massively parallel processing (MPP) architecture.

  2. Automatic high availability.

  3. Ultra-high performance.

  4. Aggressive compression.

  5. Standards-based connectivity.


In summary, cloud databases with the architectural characteristics described above will be able to not just run in the cloud, but thrive there by:

  • "Scaling out," as the cloud itself does

  • Running fast without high-end or custom hardware

  • Providing high availability in a fluid computing environment

  • Minimizing data storage, transfer, and CPU utilization (to keep cloud computing fees low)
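Point 1, the shared-nothing architecture, usually comes down to hash partitioning: each row is owned by exactly one node, so nodes never contend for shared state and can work in parallel. A minimal sketch (Python, hypothetical data):

```python
from zlib import crc32

NODES = 4
partitions = [dict() for _ in range(NODES)]   # one shard per node, nothing shared

def node_for(key):
    """Deterministic hash partitioning on the key."""
    return crc32(key.encode()) % NODES

def put(key, value):
    partitions[node_for(key)][key] = value    # the row lives on one node only

def get(key):
    return partitions[node_for(key)].get(key)  # single-node lookup, no scan

for i in range(100):
    put(f"row-{i}", i)

print([len(p) for p in partitions])           # rows spread across the shards
```

"Scaling out" then means adding shards: each node keeps a roughly even slice of the data and answers for its slice independently, which is what lets such systems grow with the cloud rather than with bigger hardware.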

Microsoft introduces Velocity, a distributed cache

Announced on the MSDN data blog: Velocity, a distributed caching technology.

Such technologies have existed for quite a long time in the Java world; they now have their counterpart in .Net.

That is a project to follow, here for instance.
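Distributed caches of this kind typically route keys to servers with consistent hashing, so that adding a server remaps only a fraction of the keys instead of reshuffling everything. A toy sketch (Python, hypothetical server names):

```python
from bisect import bisect
from zlib import crc32

class HashRing:
    """Toy consistent-hash ring for routing cache keys to servers."""

    def __init__(self, servers, replicas=50):
        # Each server gets several virtual points on the ring for balance.
        self._ring = sorted(
            (crc32(f"{s}#{i}".encode()), s)
            for s in servers for i in range(replicas)
        )
        self._points = [p for p, _ in self._ring]

    def server_for(self, key):
        # The first ring point at or after the key's hash owns the key.
        i = bisect(self._points, crc32(key.encode())) % len(self._ring)
        return self._ring[i][1]


ring = HashRing(["cache-a", "cache-b", "cache-c"])
bigger = HashRing(["cache-a", "cache-b", "cache-c", "cache-d"])

keys = [f"user:{n}" for n in range(1000)]
moved = sum(ring.server_for(k) != bigger.server_for(k) for k in keys)
print(moved, "of 1000 keys remapped")   # typically around a quarter of them
```

With naive modulo hashing, adding a fourth server would remap roughly three quarters of the keys; with the ring, only the keys the new server takes over move, which is what makes elastic cache clusters practical.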
