What's new in HALE 2.5

This page describes some of the more interesting or significant changes made to HALE for the 2.5.0 release since the last stable release, 2.1.2. This includes the milestone releases 2.5.0.M1 and 2.5.0.M2 and the release candidate 2.5.0.RC1.

Have a look here to see how the look of HALE has changed since the previous stable release.

New in 2.5.0

Several bugfixes

Based on feedback from users of the release candidate, we identified and fixed several bugs to make HALE ready for the stable release. For instance, the issues with the map view and the help browser on Linux/GTK have been resolved and the performance of the XML/GML encoding has been improved.

Headless transformation API

HALE now provides a transformation API for use in automated processes and in applications without a user interface. The headless transformation API is also used in the new HALE server application.

Export project as Zip archive

To share your HALE projects, the best option is to save them as a HALE project archive. This includes all relevant files in one Zip file - you can even include resources like schemas that are available through the web (or only on your intranet). To load such an archive, just load the Zip file as a project in HALE. A project archive is also the best way to deploy your project to a HALE server.

2.5.0.RC1

User Interface

Validation of transformed instances

During the transformation, all new instances are validated against the constraints that come with the target schema. The results are presented in the validation report, highlighting any problems with missing or invalid property values, grouped by the property where they occur. Problematic instances can be inspected in the data view.

For more information, see Validate the transformed data.
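
As a rough illustration of a report grouped by property, the following Python sketch collects problems with missing or invalid values per property. It is not HALE's implementation: the function name validate, the dictionary-based instances and the required/allowed constraint model are assumptions made only for this example.

```python
from collections import defaultdict


def validate(instances, required, allowed):
    """Group validation problems by the property where they occur.

    instances: list of dicts (hypothetical stand-in for transformed instances)
    required:  property names that must carry a value
    allowed:   mapping of property name to the set of permitted values
    """
    report = defaultdict(list)
    for index, instance in enumerate(instances):
        for prop in required:
            if instance.get(prop) in (None, ""):
                report[prop].append((index, "missing value"))
        for prop, values in allowed.items():
            value = instance.get(prop)
            # Only check values that are present; missing ones are handled above.
            if value not in (None, "") and value not in values:
                report[prop].append((index, f"invalid value {value!r}"))
    return dict(report)


transformed = [
    {"name": "Rhine", "status": "valid"},
    {"name": "", "status": "unknown"},
]
report = validate(transformed, required=["name"], allowed={"status": {"valid", "retired"}})
```

Each entry in the report keeps the index of the problematic instance, which mirrors the idea of jumping from a reported problem to the instance itself.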

Instance validation in data views

Validation information is also available in the data views. When using the Compare mode, a validation summary is displayed as a tooltip. In the Explore mode, validation warnings are shown for the affected properties.

Instance population

If you have loaded a source data set, information about how the schema elements are populated in the data is displayed. The population states the overall count of instances of the respective type or values of the respective property. In the Schema Explorer you can filter out unpopulated properties to focus only on those where values are present.

For more information, see the Schema elements reference.
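
The counting behind such population figures can be sketched in a few lines of Python. This is a minimal sketch under assumptions, not HALE code: instances are modelled as (type name, properties) pairs, each property maps to a list of values, and count_population is a hypothetical helper.

```python
from collections import Counter


def count_population(instances):
    """Count instances per type and values per (type, property)."""
    type_counts = Counter()
    property_counts = Counter()
    for type_name, properties in instances:
        type_counts[type_name] += 1
        for prop, values in properties.items():
            # A property may carry several values; an empty list adds nothing.
            property_counts[(type_name, prop)] += len(values)
    return type_counts, property_counts


instances = [
    ("River", {"name": ["Rhine"], "width": []}),
    ("River", {"name": ["Main", "Meno"], "width": []}),
]
type_counts, property_counts = count_population(instances)

# Keep only properties that actually carry values, similar to filtering
# unpopulated properties in a schema tree.
populated = sorted(prop for (_, prop), n in property_counts.items() if n > 0)
```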

Load pre-defined schemas

HALE now offers a selection of pre-defined schemas you can load. When importing a source or target schema, just select From preset and then click on the selection field to show the list of pre-defined schemas. The tag Bundled marks a schema that is bundled with HALE and can be loaded without an internet connection.

Currently the presets include CityGML schemas from citygml.org and the INSPIRE Annex I GML Application Schemas from the INSPIRE schema repository.

Compare only populated properties

In the data views there is a new variant of the Compare mode, which only displays properties populated in the loaded source data set or the live transformation result - this makes it easier to compare instances, especially with very complex schemas.

Transform external data

Up to now you had to load your data into HALE to execute a transformation and export the result - but when source data is used to support creating the mapping in HALE, a manageable example data set works best. The new Transformation menu lets you launch a dedicated transformation job based on the current alignment and schemas, to transform bigger data sets.

For more information, see Transform external data.

Show cells associated with a schema element

If you want to see or edit the mapping cells associated with a specific schema element, you can use the new Mapping View. It displays the cells associated with the schema element selected in the Schema Explorer. Like in the Alignment View, you can use the context menu to edit or delete a mapping cell.

Replace a mapping cell

Replacing a mapping cell up to now was a two-step task - deleting the old relation and then creating a new one. This can be tedious if you are working with a source data set and the transformation is run in between. Now for property relations you have the possibility to do it in one step. For replacement, every property function is offered, not restricted to those applicable to the properties involved in the relation you want to delete.

Select specific values in Compare mode in data views

When using the Compare mode in a data view, you can now select a specific value for a property if there are multiple values. So if a value is annotated with something like (1 of 7), you can hover over the corresponding table cell and select a different value (e.g. the second, third, etc.) from the combo box that appears.

Generate sequential identifiers

A new function has been added that allows generating sequential identifiers. You can specify a prefix and suffix as desired or needed to create a valid value for the target property.
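
The idea of combining a running number with an optional prefix and suffix can be sketched in Python as follows. The generator sequential_ids and its parameters (including the start value) are assumptions for illustration and do not reflect how the HALE function is configured.

```python
from itertools import count


def sequential_ids(prefix="", suffix="", start=1):
    """Yield identifiers of the form <prefix><number><suffix>."""
    for number in count(start):
        yield f"{prefix}{number}{suffix}"


ids = sequential_ids(prefix="river_", suffix="_v1")
first_three = [next(ids) for _ in range(3)]
# first_three == ["river_1_v1", "river_2_v1", "river_3_v1"]
```

A non-numeric prefix is useful where the target property requires, for example, an identifier that must not start with a digit.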

Specify missing SRS information

If the spatial reference system can't be determined for a geometry when loading a Shapefile or GML file, HALE now asks you to specify the SRS to use, either through an EPSG code or a WKT string.

Under the Hood

Java bundled with HALE on Windows and Linux

No need to install Java yourself - HALE includes its own version of Java it runs with. If you want to use your local version of Java, just delete the jre folder in the HALE directory.

Gzipped XML

HALE now supports directly reading, writing and validating gzip-compressed XML or GML files.
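
To show what reading and writing gzip-compressed XML involves in general, here is a small Python sketch using only the standard library. It is unrelated to HALE's own reader and writer; the helper names write_gzipped_xml and read_gzipped_xml are made up for this example.

```python
import gzip
import io
import xml.etree.ElementTree as ET


def write_gzipped_xml(root):
    """Serialize an element tree and return it as gzip-compressed bytes."""
    return gzip.compress(ET.tostring(root, encoding="utf-8"))


def read_gzipped_xml(data):
    """Parse gzip-compressed XML without writing a decompressed copy to disk."""
    with gzip.open(io.BytesIO(data), "rb") as stream:
        return ET.parse(stream).getroot()


root = ET.Element("features")
ET.SubElement(root, "feature").text = "Rhine"

compressed = write_gzipped_xml(root)
restored = read_gzipped_xml(compressed)
```

Because the parser reads from the decompressing stream, the uncompressed document never needs to exist as a separate file.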

Improved Shapefile data import

When loading data from a Shapefile, the contained features are streamed instead of loaded completely into memory. This allows loading big Shapefiles or using them in the external data transformation.

Improved internal database storage

The space the internal database uses has been drastically reduced. The internal database serves for temporarily storing source and transformed instances.

Load HALE 2.1 projects

Projects created in HALE up to version 2.1.x can now be imported again, though with some limitations:

Previously defined filters are not applied
The NilReasonFunction is converted to an assignment
The BoundingBoxFunction is not supported
The ClipByRectangleFunction is not supported
Problems can arise if multiple properties with the same local name but different namespace exist in a type and were mapped.

Please take a look at the reports after importing such a project, to identify if and where there might be problems.

2.5.0.M2

User Interface

Integrated example projects

Example projects can now be integrated and shipped with HALE. For now it comes with some simple examples to explain basic functionalities and two examples based on a geographic data set from Ordnance Survey - one with a Retype, the other with a Merge type relation.

Enable/disable live transformation

The transformation of the source data you loaded into HALE is triggered on every change to the mapping. Sometimes this may hinder your workflow, especially when you are dealing with a big data set. Now you can enable or disable the transformation at your convenience.

Map tooltips

Quickly identify instances in the map through the tooltip when hovering over it. Press c to copy the tooltip content to the clipboard.

Support styling of point geometries

The geometry renderer in the new map view previously lacked the capability to render custom styles for point geometries. Now you can use the SLD editor to assign different markers, graphics or your custom SLD to your point objects.

Under the Hood

Improved geometry support for reading XML/GML

Many more GML geometry types are now supported when reading XML files. Please have a look at the XML Data Import for a detailed listing.

Undo/redo support also for contexts

Creating or removing a context on a schema element can now be undone.

2.5.0.M1

User Interface

Welcome page & Getting started guide

When first starting HALE, you will be greeted by a Welcome page that helps you quickly open an existing project or start a new one. The guide gets you started on the basic workflow when working with HALE.

Integrated user guide

The integrated help can be launched through the toolbar or the Help menu - it includes a user guide for HALE that describes the main workflow and the different perspectives and views, and explains how to perform certain tasks and the concepts behind them.

Context-sensitive help

Pressing F1 will open a view with help topics that are related to your active view and possibly even to your current selection. For example, with a cell selected in the alignment view, the help will provide you with the link to the documentation of the related transformation function.

Reports on performed actions

Actions like loading or saving data, loading schemas, performing the transformation or exporting the alignment produce a detailed report. Select a report in the Report List view to see, in the Properties view, any warnings and errors that occurred during the process.

Inspect selection in Properties view

Detailed information on a selected object is now displayed in the Properties view - currently this works for schema elements, alignment cells, functions and reports. Programmers can easily extend it with additional sections.

Alignment view

The Mapping Graph view has been reworked to better integrate with the rest of the application and is now called Alignment view. It renders the old mapping view obsolete, as editing and deleting cells can now be performed on the alignment view selection, and details on the selected cell are displayed in the Properties view.

For more information, please have a look at the Alignment view documentation.

New map view

The map view has been replaced by a new component that is able to put the data into context by displaying it on a background map. By default, OpenStreetMap is used as the background map. Instances in the map can be selected and their style adapted with an integrated editor.

For more information, please have a look at the Map view documentation.

Merge instances of the same type

A new type relation has been added that merges multiple source instances of the same type, based on an equal property value. The other properties are either merged as well, or all of their values are available in the merged instance.
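
The grouping behind such a merge can be sketched in Python as shown below. This is an illustrative sketch, not the HALE Merge implementation: instances are plain dictionaries, merge_instances is a hypothetical helper, and the choice to collapse equal values while keeping all distinct ones is an assumption about how the two cases might be combined.

```python
from collections import defaultdict


def merge_instances(instances, key):
    """Merge instances that share the same value for the property `key`."""
    groups = defaultdict(list)
    for instance in instances:
        groups[instance[key]].append(instance)

    merged = []
    for key_value, members in groups.items():
        result = {key: key_value}
        for member in members:
            for prop, value in member.items():
                if prop == key:
                    continue
                values = result.setdefault(prop, [])
                # Equal values collapse into one; differing values are all kept.
                if value not in values:
                    values.append(value)
        merged.append(result)
    return merged


parcels = [
    {"id": "A", "owner": "Smith", "use": "forest"},
    {"id": "A", "owner": "Smith", "use": "meadow"},
    {"id": "B", "owner": "Jones", "use": "forest"},
]
merged = merge_instances(parcels, key="id")
```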

Type Hierarchy view

The new Type Hierarchy view can be used to inspect the hierarchy of a specific schema type. Select a schema element, e.g. in the Schema Explorer, and the associated type and its super- and sub-types will be displayed. You can focus on another type in the hierarchy by double-clicking on it.

Functions view

Get an overview of the available transformation functions. Detailed information on a selected function can be obtained either through the Properties view or the contextual help (pressing F1).

Support for CSV files

Comma Separated Value (CSV) files can now be used as source schema and data. For the schema, column names can either be used as available in the first row of the file, or specified manually.
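
The two ways of obtaining column names can be illustrated with Python's csv module. This is a sketch under assumptions: read_csv_instances is a hypothetical helper, and it assumes that a file with manually specified names has no header row.

```python
import csv
import io


def read_csv_instances(text, column_names=None):
    """Read CSV rows as dicts, taking names from the first row or from the caller."""
    reader = csv.reader(io.StringIO(text))
    if column_names is None:
        # Use the first row of the file as the column names.
        column_names = next(reader)
    return [dict(zip(column_names, row)) for row in reader]


with_header = "name,length\nRhine,1233\nMain,527\n"
without_header = "Rhine,1233\nMain,527\n"

from_first_row = read_csv_instances(with_header)
from_manual_names = read_csv_instances(without_header, column_names=["name", "length"])
```

Note that all values arrive as strings; converting them to numbers or dates is a separate step.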

Mapping relevant types

Not all types defined in a schema are usually relevant for a mapping. Now you can customize which types are relevant for your mapping in the source and target schemas - thus, only those types are displayed in the Schema Explorer, and only data instances of these types are loaded when loading source data.

Create contexts on schema elements

A context on a schema element may represent a condition applied to it, an element at a certain index or just an additional instance of a target property that may be populated. Contexts can be created for a selected schema element through the context menu in the Schema Explorer, if applicable.

For more information, please have a look at the more detailed Contexts documentation.

Filters on nested properties

Filters for use in the data views and condition contexts have been improved to support specifying nested properties. Use the Insert attribute name button to choose a schema element and insert the corresponding name.

For more information, please have a look at the documentation on CQL Filters.

Cell explanations

It is not always easy to grasp how a transformation function works and what effect a defined relation has. On a selected mapping cell you now get an explanation (displayed in the Properties view) that expresses the relation in a more human-readable way.

Improved HTML mapping documentation

The export of your mapping as an HTML file presents you with a mapping documentation you can easily share. First you get an overview of all type relations, then for each type relation the relevant property relations, including their explanation.

Undo/Redo support

When creating your mapping you now can undo or redo the performed steps, like adding, replacing or deleting cells.

Structural rename

The Rename relation now supports copying a complex structured property, given that the structure of the target property is similar.

Generic function wizard

Instead of letting each transformation function provide its own wizard for it to be configurable through the user interface, setting the source and target entities and specifying parameters is done through the generic function wizard. This makes it a lot easier to extend HALE with new transformation functions. If desired, functions can still provide their own wizard pages for the parameter configuration.

Under the Hood

New application architecture

HALE has undergone a major restructuring of the whole application, motivated by the following goals:

Simple but flexible and extendable models for schema, data and alignment
Modular architecture that allows the use in different environments (e.g. UI and Server)
Common infrastructure for all I/O operations
Support for the transformation of large data sets
Easy extension of HALE with additional functionality

There are several extension points allowing HALE to be extended by Plug-ins. Apart from the general UI extension points provided by Eclipse RCP, which for instance allow you to add your own views, these are some of the functionalities you can extend:

Transformation function definitions and implementations
Converters for automatic value conversion
Import and export formats for schema, data, alignment etc.
Map tools, overlays, layouts and maps
Instance representations in data views
Online resources included for offline use

Internal database based on OrientDB

Data is now stored in a temporary internal database instead of being held in memory all the time. This makes it possible to deal with bigger data sets, as only those instances actually needed are retrieved from the database. Transformation can be performed in a streaming-like process, where loading data, transformation and storing the transformation result are done in parallel.

Multiple schemas as source or target

You can combine multiple schemas to form your source and target schemas, for example an XML Schema together with a Shapefile and a CSV file could form your source schema. Currently there is one exception to this: you can't load multiple XML Schemas into source or target. This is to prevent namespace incompatibilities and duplicated schema elements - if you need to use multiple XML Schemas, you can use the workaround as described here.

Streaming XML and GML

HALE now provides a generic XML and GML reader and writer that support streaming, i.e. they read or write the data instance by instance instead of all at once.
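
Reading instance by instance can be illustrated with the incremental parser from Python's standard library. This is a generic sketch of the streaming idea, not HALE's reader: stream_instances is a hypothetical helper, and the flat element structure is an assumption chosen to keep the example short.

```python
import io
import xml.etree.ElementTree as ET


def stream_instances(source, tag):
    """Yield one dict per element named `tag`, releasing each element after use."""
    for _event, element in ET.iterparse(source, events=("end",)):
        if element.tag == tag:
            yield {child.tag: child.text for child in element}
            # Drop the element's content so memory use stays bounded.
            element.clear()


document = io.BytesIO(
    b"<collection>"
    b"<river><name>Rhine</name></river>"
    b"<river><name>Main</name></river>"
    b"</collection>"
)
names = [instance["name"] for instance in stream_instances(document, "river")]
```

Because the function is a generator, a consumer can transform and write each instance before the next one is parsed.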

Conversion service and automated conversion

A conversion service is provided as an OSGi service in the application and can be used for value conversions and extended with custom converters. In addition, transformation function results are automatically converted to the corresponding target property type (if the function does not prohibit this).
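
The general pattern of a conversion service with pluggable converters might look like the following Python sketch. It is only a conceptual illustration: the class ConversionService and its register and convert methods are hypothetical and do not correspond to the OSGi service interface used in HALE.

```python
class ConversionService:
    """Registry of converters keyed by (source type, target type)."""

    def __init__(self):
        self._converters = {}

    def register(self, source_type, target_type, converter):
        """Add a custom converter for one pair of types."""
        self._converters[(source_type, target_type)] = converter

    def convert(self, value, target_type):
        """Convert a value to the target type, if a converter is known."""
        if isinstance(value, target_type):
            return value  # Already of the expected type.
        converter = self._converters.get((type(value), target_type))
        if converter is None:
            raise TypeError(
                f"no converter from {type(value).__name__} to {target_type.__name__}"
            )
        return converter(value)


service = ConversionService()
service.register(str, int, int)
service.register(str, float, float)

# A function result given as text is converted to the target property type.
length = service.convert("1233", int)
```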

New transformation service implementation

The Conceptual Schema Transformer (CST) has been reimplemented based on the new models. It is the default transformation service implementation and uses a transformation tree to handle the transformation of complex structures with varying cardinalities.

Bundle schemas with HALE

Plug-ins may extend HALE to provide schemas for offline use. Some well-known schemas are already bundled with HALE, e.g. the GML 3.2.1 schema, to speed up schema loading.