Microsoft Year 2000 Readiness Disclosure & Resource Center
Preparing Office Solutions for the Year 2000
The Generic Problem
Although there is a general awareness of the Year 2000 problem, some
developers do not understand it or its significance. The problem
arises from the way certain applications, especially older ones,
store and use data that the programmer or user of the system
interprets as dates. (For a complete discussion of the problem, see
the Microsoft Year 2000 Resource Center at
http://www.microsoft.com/y2k/.)

Specifically, whenever a product stores or accepts input of date
information using only two digits to represent the year, it puts the
user at risk of data loss. In early computer programs, which were not
expected to last as long as they have, the concept of a two-digit
year was perfectly plausible. Two-digit years were always assumed to
occur in the 20th century. For example, an order entry system
designed in 1970 could assume that a date entered as 1/1/75 was
January 1st, 1975, not 2075. As the new millennium approached,
however, it became clear that storing only two digits would not be
enough if a computer system was to effectively accept, store, and
display dates from both the 20th and 21st centuries.
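The 20th-century assumption described above can be sketched in a few
lines. This is only an illustration in Python, which is not one of
the products this paper covers, and the function names are
hypothetical. It shows that the rule reads 1/1/75 as intended but
produces nonsense arithmetic as soon as a year of 00 appears:

```python
def expand_year_20th_century(two_digit_year: int) -> int:
    """Expand a two-digit year the way many early programs did: always 19xx."""
    if not 0 <= two_digit_year <= 99:
        raise ValueError("expected a two-digit year (0-99)")
    return 1900 + two_digit_year


def years_between(start_yy: int, end_yy: int) -> int:
    """Elapsed years between two two-digit years, inheriting the 19xx rule."""
    return expand_year_20th_century(end_yy) - expand_year_20th_century(start_yy)


# An order entered as 1/1/75 is read as 1975, as the 1970 designer intended.
order_year = expand_year_20th_century(75)

# An account opened in '75 and reviewed in '00 appears to be -75 years old,
# because '00 is expanded to 1900 rather than 2000.
account_age = years_between(75, 0)
```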
The Year 2000 problem can be further decomposed into three main
parts:
- The storage problem
- The interpretation problem
- The system problem
The Storage Problem
The first Year 2000 problem is one of storage. On many older systems,
application developers worked hard to minimize the storage space
required for data. After all, disk storage was very expensive in the
early days of computing. A common practice was to set up date fields
with only two digits for the year.

While this practice may seem shortsighted, remember that in previous
years the cost of fixed storage was much higher than in today's
market. This, coupled with the traditional developer's goal of making
the most efficient system possible, led to a situation where many
legacy systems simply cannot store year data beyond this century.
Fixing this part of the problem requires not only restructuring the
data to accommodate four-digit years, but also rewriting and
re-testing all of the business and program logic associated with
those fields.
The Storage Problem and the Desktop
So how does the storage problem affect desktop developers? It depends
on the type of data that you are working with. If every database or
spreadsheet you work with uses formal date types for fields, instead
of representing dates with text strings, for instance, the storage
issue isn't a problem. Database programs such as Microsoft Access
(which shares the Microsoft Jet Database Engine with other Microsoft
Office programs) handle dates correctly. That is, they always store
all four digits of the year. So, if your database engine always
stores four-digit years, what's the problem? In a nutshell, the
problem occurs when data comes into your application, either through
human data entry or through links to external data. This data is
often stored in a string format, often with only a two-digit year.
Example Scenarios
Think of a mainframe database that stores credit card member
information. The database has three date-related fields: a membership
date, an expiration date, and an anniversary date. Now assume that
part of your application needs to bring this data in, modify it, and
provide a series of forecasting reports to management.
Scenario 1: Probable Date
The first field, called membership date, signifies the date that the
customer became a cardholder. It is a 6-character text field that
holds data like "120395". Such a value is supposed to mean December
3rd, 1995, and "sort of looks like a date".

But when you import the data into your application, one of two things
happens:
- If the importing program allows you to specify that this field
is a date field, it should be able to parse the string text into a
formal date field. But the problem here is: what does "95" mean?
Does it mean 1995 or 2095? What happens next is based solely on
the rule your application's import logic uses.
- If the importing program doesn't handle conversion of text
fields to dates, you'll end up with a 6-character text field. This
can be even more troublesome. Because it's not readily
identifiable as a date field, it can be hard to spot and
restructure for Year 2000 compliance.
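Both outcomes can be illustrated with a short Python sketch. The
function names and the pivot value of 30 are assumptions made for
this example only, not the rule of any particular import program. The
first function makes the century rule explicit instead of leaving it
to the importer. The second is a simple check that could help flag
6-character text fields that happen to hold dates:

```python
from datetime import date


def parse_mmddyy(text: str, pivot: int = 30) -> date:
    """Parse a 6-character MMDDYY string using a fixed-window rule.

    Two-digit years below `pivot` are read as 20xx, the rest as 19xx.
    The default pivot of 30 is an illustrative assumption.
    """
    if len(text) != 6 or not (text.isascii() and text.isdigit()):
        raise ValueError(f"not an MMDDYY value: {text!r}")
    month, day, yy = int(text[0:2]), int(text[2:4]), int(text[4:6])
    century = 2000 if yy < pivot else 1900
    # date() raises ValueError for impossible dates such as month 13.
    return date(century + yy, month, day)


def looks_like_mmddyy(text: str) -> bool:
    """Return True if a text value could be an MMDDYY date."""
    try:
        parse_mmddyy(text)
    except ValueError:
        return False
    return True


membership = parse_mmddyy("120395")     # year 95 is at or above the pivot: 1995
recent_member = parse_mmddyy("010105")  # year 05 is below the pivot: 2005
```

A check like looks_like_mmddyy only narrows the search. A column of
values that all pass is a candidate for review, not proof that the
field holds dates.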
Scenario 2: Possible Date
Now let's look at the second field in the mainframe database. This
field stores the member's card expiration date. Anybody who has ever
had a credit card knows that the expiration date is expressed as four
digits. For example, an expiration date of "12/97" means that the
card expires in December of 1997.

Now imagine that the mainframe data field that stores this is either
a 4-digit number or a 4-character string. Either way, when you import
the data, you are introducing potential Year 2000 issues. What does
the "97" mean? Is it 1997 or 2097?
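One way to answer that question for expiration dates is a sliding
window: choose the century that places the year within a 100-year
span around a reference year. The following Python sketch assumes a
window that starts 20 years before the reference year. That span, the
function names, and the accepted input shapes ("MM/YY" or "MMYY") are
all assumptions for illustration:

```python
from typing import Tuple


def expand_year_sliding(yy: int, reference_year: int, years_back: int = 20) -> int:
    """Return the four-digit year ending in `yy` that falls inside the
    100-year window starting `years_back` years before `reference_year`."""
    if not 0 <= yy <= 99:
        raise ValueError("expected a two-digit year (0-99)")
    window_start = reference_year - years_back
    candidate = window_start - (window_start % 100) + yy
    if candidate < window_start:
        candidate += 100  # too far in the past, so use the next century
    return candidate


def parse_expiry(text: str, reference_year: int) -> Tuple[int, int]:
    """Parse an expiration value such as "12/97" or "1297" into (year, month)."""
    digits = text.replace("/", "")
    if len(digits) != 4 or not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"not an MMYY value: {text!r}")
    month, yy = int(digits[:2]), int(digits[2:])
    if not 1 <= month <= 12:
        raise ValueError(f"invalid month in {text!r}")
    return expand_year_sliding(yy, reference_year), month


# With 1998 as the reference year, the window covers 1978 through 2077.
expires_97 = parse_expiry("12/97", reference_year=1998)
expires_02 = parse_expiry("0102", reference_year=1998)
```

Unlike a fixed pivot, the window moves with the reference year, so
the same rule keeps working as time passes.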
Scenario 3: Unlikely Date
And now we move on to the most insidious issue relating to data
storage: the "hidden date" problem. Looking through the data in the
third field, anniversary date, you see values like 908, 201, 3E9,
F28, and so on.

What you probably don't know is that the programmer who designed the
mainframe storage model was running desperately low on disk space
when he was asked by the IT manager to add a new date field to hold
the cardmember's anniversary date. Our resourceful programmer
realized that he could stuff an entire date into three characters
using a simple compression algorithm.

Now you are importing the data into your system on a regular basis.
But how do you know what this means? More importantly, when your
application goes through a Year 2000 analysis, this mystery field may
never be flagged as an issue to address. Why not? Because neither the
field name nor the data in the field identifies it as date-related.

As you can see, Year 2000 issues can be deeply hidden in a system.
Because of this, automated tools can only go so far in identifying
the problem. When dealing with data, there is no substitute for good
old human know-how.
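This paper does not say which compression algorithm the programmer
used, so the following Python sketch is purely hypothetical. It
invents one possible scheme, a count of days since an assumed base
date written as three hexadecimal characters, only to show how a
value with no visible date structure can still carry a date and a
built-in expiry:

```python
from datetime import date, timedelta

# Hypothetical base date chosen for this sketch; it is not from the source.
EPOCH = date(1990, 1, 1)


def pack_date(d: date) -> str:
    """Encode a date as three hex characters: days elapsed since EPOCH."""
    offset = (d - EPOCH).days
    if not 0 <= offset <= 0xFFF:
        raise ValueError("date does not fit in three hex characters")
    return format(offset, "03X")


def unpack_date(code: str) -> date:
    """Decode three hex characters back into a date."""
    if len(code) != 3:
        raise ValueError(f"expected three characters, got {code!r}")
    return EPOCH + timedelta(days=int(code, 16))


packed = pack_date(date(1995, 12, 3))
restored = unpack_date(packed)
```

Three hex characters hold only 4,096 distinct values, so a scheme
like this one can span little more than eleven years from its base
date. Nothing in the stored value reveals that limit, which is why
such a field is easy to miss in a Year 2000 review.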
The Interpretation Problem
The second Year 2000 problem is the one that will affect you most as
a desktop application developer. It is the problem of interpretation.
Almost all software that is hosted on the Microsoft Windows platform
is designed to hold date data with four digits for the year.
Development environments such as those found in Microsoft Office
store the full four digits of the year. Therefore, it is very
unlikely that you will have to restructure systems to increase date
storage size.

The problem then is one of interpretation: even though your
applications store four digits for the year, most will show only two
digits on data-entry forms, view forms, and reports, because this is
how humans most often think about years. (When was the last time you
included four digits for the year when writing a check?) So if a user
types in a two-digit year, such as "1/1/99", what is actually stored
in your application? Unfortunately, the way applications interpret
this date differs. The rules for interpretation of short years (those
with only two digits) depend on a variety of factors. The rules used
by each of the products covered in this paper are discussed later.
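To see how two reasonable rules can disagree, consider the following
Python sketch. Python is used only for illustration and is not one of
the products covered in this paper. Its standard library documents
that the %y directive follows the POSIX convention (69 through 99
become 1969 through 1999, and 0 through 68 become 2000 through 2068).
The second rule, a pivot of 30, is an assumption chosen for contrast,
and the function name is hypothetical:

```python
from datetime import datetime


def interpret_with_pivot(text: str, pivot: int) -> datetime:
    """Parse M/D/YY, reading years below `pivot` as 20xx and the rest as 19xx."""
    month, day, yy = (int(part) for part in text.split("/"))
    if not 0 <= yy <= 99:
        raise ValueError("expected a two-digit year")
    century = 2000 if yy < pivot else 1900
    return datetime(century + yy, month, day)


entered = "1/1/40"

# Rule built into Python's strptime: 40 falls in the 0-68 range.
library_rule = datetime.strptime(entered, "%m/%d/%y")

# Illustrative pivot of 30: 40 is at or above the pivot.
pivot_rule = interpret_with_pivot(entered, pivot=30)

# The same keystrokes produce dates a full century apart.
years_apart = library_rule.year - pivot_rule.year
```

Both rules agree on "1/1/99", which is why differences like this can
go unnoticed until data moves between systems.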
The System Problem
Finally, there is the operating system to consider. The MS-DOS
operating system understands four-digit dates up to the year 2108.
Microsoft Windows 3.11 and Microsoft Windows 95 also support years up
to 2108. On the horizon, Microsoft Windows 98 will automatically
handle all DOS date century problems. Additionally, Microsoft Windows
NT handles BIOS and date century problems automatically as of version
4.0.