Firebirds compatibility with the real world

Created by: Lester Caine, Last modification: 13 May 2018 (11:41 UTC)

This started as an email, but I realsied that I needed to expand on various point as I go, so this is the base version and I can add detail pages on various points. Ideal timestamp data identifies my ideal timestamp.

Many of the current standards do not always take real world situations into account, or even define basic limitations they have assumed. Blindly following one standard, such as the SQL ones misses the fact often that these were written before other standards that provide the essential data. Also at that time computers only talked in English so there was no need to worry about other alphabets. Today native English speakers and Christians are a minority and the majority expect their computers to work with their standards and at the very least allow them to enter data in their local language. So UTC8 tends to be a basic requirement of any data system and all the problems that introduces, yet the SQL standards are still essentially 'English' as is the case with most programming languages and international standards like the RFC system. PHP has even given up using anything but ASCII for code, although UTF8 variable names work most of the time if not documented and is another can of worms. We supposedly have a soultion to international names, but does it work everywhere?

Currently the Firebird calendering system is very limited in what it can actually support. Many things date back to a time when it was easier to assume 1970 was the start of the modern world, so Firebird's adoption of 1AD on a Christian calendar was at least progress at that time, but just what calendar does it support 2.5 Language ref only says 01/01/0001 to 31/01/9999 and no mention if the gregorian 100 year fix is observed, but having just tested things out, it looks like it is a gregorian calendar. Just which country used which calendar is variable upto as recently as 1922 for Greece so one can't have greek date 1900/02/29. So we can't use date offsets prior to the adoption of the gregorian calender. (https://www.timeanddate.com/calendar/gregorian-calendar.html)

Adding translations to support local users is something that databases are much better at than hard coded archives like those provided by ICU and other data sources. Although on line resources like geonames.org is now providing a fairly comprehensive cross reference for everything location wise. Providing a local timezone against a location is one of those facilities (you have to look up lat and lng first) but as with most of this material, there is no detail on what version of TZ is being used or more important these days if the location is in the same timezone as it was even last year! Also what needs to be confirmed is just what accuracy values are limited to. Being supplied as JSON data, the java manual is referenced, so Class TimeZone is used. This defines basic offsets in milliseconds, but only specifies hours and minutes in the definition. It does also highlight that legacy abbreviations are deprecated since they are not unique identifiers.

ICU is assumed to be current base to internationalize material and it does provide a tidy starting point. The main problem is that it still assumes that the current set of data it provides is the only valid set, but at least for timezone material, this changes more frequently than the versions of it's compiled libraries. So it NEEDS to mange better the version of data being returned! While developing the RFC7808 Time Zone Data Distribution Service there was one school of thought that we only ever needed the current view of timezone information, and that is fine for a large part of the world, but nothing is static and politicians like to put their mark on the calendar, so changing their local clock to some new offset is all too common. Without some indication that today's rules have changed it is difficult to know that the meeting you have panned on the other side of the world is going to happen at the time YOU have it booked for. OK the chance of a problem is small so the design rule that 'it does not matter' can be applied? It SHOULD matter.

ICU is only processing data that is provided by third party's, and in the case of time zones this is the TZ database provided by IANA. I've provided that like since it does now highlight the main problem with TZ as a source of timezone data. Current defaults only provide correct data post 1970 for reasons that are documented elsewhere. The result is that while many timezone identifiers produce accurate pre-1970 rules, a number are simply aliases to a rule set that matches post 1970. Again, it is often not identified that the ident IS an alias so the pre-1970 rules may not be correct. RFC7808 may have been developed by IANA and one would prehaps expect it to be based on TZ, but the limiations of TZ were recognised from the start, so in the usual committe compromise is that nothing is defined except the mechanism. So a new time zone data publisher can set up a tzdist service with their own set of time zone identifiers! Note also, it's time zone with a space ... 

While there are a few private TZdist services, IANA has not commited to setting up as a publisher to provide TZ via the service, but has recently started a discussion on other holes in the process. The mail list joining form is all that exists yet for tzdist-bis.