We are happy to announce that Wrestling Legacy Data to the Web & Beyond: Practical Solutions for Managers & Technicians is available for ordering. Please respect our copyright!

From Chapter 1: The Battle Cry!

You’ve heard the questions from managers, consultants, and industry pundits:

And, the dreaded:
The answers usually involve the words migrate or transform, which become the battle cries heard in every industry from banking to agricultural engineering. Those words, migrate and transform, mean something different to everyone who uses them. For most companies they mean making legacy data available beyond its typical paper delivery system, and that is not a challenge for the faint of heart.

We are going to take a careful look at all of the factors that go into a successful marriage of legacy print applications and delivery of the legacy print data using alternative means, including the web, PDAs, and even electronic ink-based devices. When you reach the end of the book you should know what you need to know to ask intelligent questions of your vendors and colleagues, as well as where to go for more help. You should even be able to interpret the answers they give you!

The simple fact in the world of corporate high-speed printing is that not everyone who works with the printers and print data has all of the terminology for all of the components of their original format or their target format at their disposal. We are going to try to help with that, too. First, we use the term resource throughout the book. The term resource is not universally understood, however. To be clear, we mean the fonts, forms, graphics, and print environment files that go into controlling the print job.

What does it mean to work with legacy data and bring it to alternative platforms? Engineering the solution is different for every company, but the mechanics and tools are remarkably similar. The hundred-thousand-foot overview is that you’ll need to uncover the deep dark secrets of all of your applications, then try to locate the best solutions for moving both the applications and the data they work with to the new platforms and delivery vehicles. All the while, you’ll be battling look and feel issues. When you move to the web:
Then, you move into even more interesting questions. What happens when you move to even smaller display devices like cell phone screens, PDA screens, or even purpose-built devices? What are your liabilities? What are your responsibilities? Beyond the look and feel of your applications, there are the security issues and access issues that raise their heads at every turn.

About the time you think you’ve answered every question and encountered every problem, testing will uncover even more of those deep dark secrets. You will find graphics that were created with proprietary fonts and data that was re-mapped in COBOL, PASCAL, PL/I or even FORTRAN routines that may not move to the new platforms. The routines may be hidden in external procedures and files that hide them from view. You may even find programming written within the print applications using the printer datastream language! The older the application, the more likely it is that you’ll find interesting anomalies. However, do not be surprised if you find them in even your newest applications.

If this sounds like a big job, it is. However, it is not an impossible job. Many companies have successfully moved applications originally designed for Xerox, IBM, generic line printers, and even other proprietary printers to new print environments, to the web, and beyond. They have taken many paths, and many innovations have been forged along the way. So, let’s look at the tools you need to determine how to transform the output from any application so that it is available for use on any delivery device, no matter how big or how small, wired or wireless.
Legacy data comes in a variety of sizes, shapes and descriptions. It may be the output of programs written in-house over the past 30 years, the output of commercial software installed over the past 30 years, or even the output of programs written or purchased within the past few months. Within the enterprise, it may have many personalities, especially if the corporate culture permits departments to build or buy their own application solutions. Some departments may have grown from departmental systems through mid-range systems like the IBM office systems, System/36s and AS/400s on their way to network-based printing. Others may have selected office systems from companies like Xerox, Datapoint, Data General, and Tandem, while others remained on paper until the advent of the networked PC. And, there are those who are at home on the big iron, developing all of their support applications on the IBM or IBM plug-compatible hosts.

Beyond basic line data, there are the complex additions to line data formats, most commonly using IBM’s Advanced Function Printing/Presentation (AFP) and Xerox Dynamic Job Descriptor Entries (DJDEs). The syntax for using AFP commands and DJDEs has evolved over the years, so you may find inconsistencies in coding and old syntax in your data. Take another step and you meet the more complex forms of AFP that include composed data representations with a myriad of variations. Xerox print applications can have the appearance of full application programs or they may use Xerox’s proprietary Metacode print datastream instead of or in addition to line data marked up with DJDEs. Don’t forget the data formatted for printing on PCL printers (from Hewlett-Packard or other vendors) and the data created for printing on PostScript devices or viewed in Acrobat using the Portable Document Format (PDF).

Over time, all of these formats and languages have changed. Subtly at times, and dramatically at others, they have grown to accommodate the evolution of the languages and the devices they support. The programmers assigned to the print applications may have made changes to the base applications or written bridge code to reformat incoming data. As you can guess, they may not have documented everything they did in the rush to meet application deadlines. These things will make re-using the existing applications more challenging. This is the problem with legacy data.

A Glance toward Your Applications

Even after you identify the print characteristics of your applications, you will face the challenge of identifying the applications that produce the print and getting that old data into a format appropriate for your new output medium.
Most large enterprises must still print most of their application output at some point in its life, and that print generally revolves around print formats like IBM’s AFP, Xerox DJDE/LCDS, line data of all varieties, and the original desktop formats like HP PCL, Adobe PostScript and Adobe PDF. Those applications generally use fonts with varied histories. Some of the fonts were installed with the printers and their basic programming applications as far back as the 1970s, when they were designed for use on lower-resolution print devices.

One of the big problems with legacy data revolves around the fonts used to develop the original application. There is a lot more detail about font issues in later chapters, but here we want to shine a light on font basics. Without a clear understanding of what fonts your data expects to use, it is easy to make poor decisions about how to handle fonts in the new delivery environment.
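Character spacing is one place where font assumptions become visible. As a minimal sketch, consider fixed-pitch text at the two classic typewriter pitches, Pica (10 characters per inch) and Elite (12 characters per inch). Those pitch values are the standard ones but are not stated in this chapter, and the 6.5-inch line width and the helper names below are assumptions chosen purely for illustration.

```python
import textwrap

# Standard typewriter pitches (characters per inch); not taken from the text.
PICA_CPI = 10
ELITE_CPI = 12

# Assumed line width: an 8.5-inch page with 1-inch margins on each side.
LINE_WIDTH_INCHES = 6.5


def chars_per_line(line_width_inches: float, chars_per_inch: int) -> int:
    """Number of whole fixed-pitch characters that fit on one line."""
    return int(line_width_inches * chars_per_inch)


def wrap_for_pitch(text: str, line_width_inches: float, chars_per_inch: int) -> list:
    """Wrap text the way a fixed-pitch device of the given pitch would."""
    return textwrap.wrap(text, width=chars_per_line(line_width_inches, chars_per_inch))


for name, cpi in (("Pica", PICA_CPI), ("Elite", ELITE_CPI)):
    print(f"{name}: {chars_per_line(LINE_WIDTH_INCHES, cpi)} characters per line")
```

Because the two pitches fit a different number of characters on the same line, the same text breaks at different points. Proportional fonts and device-specific metrics make the effect harder to predict than this fixed-pitch case.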
The fonts that you use with your legacy data are files with information about how to map the data to a visual representation of the data. That representation is often specific to the hardware used for printing. When you try to migrate to new and exciting output devices it can be difficult to cause the data to appear identical to the original print because of the variations in the font file formats, their internal architectures for building the character images, and how they handle the white space between characters and between the lines.

If you are old enough to remember typewriters, you might remember the difference between Pica and Elite typewriters. If you typed a document on a Pica typewriter, you saw different line endings than you saw if you typed on an Elite typewriter. Pica and Elite typewriters each use a different number of fixed characters per inch. Those problems are multiplied a thousand-fold in the world of legacy migration.

Prepare to make decisions regarding what fonts are critical to the look and usability of your documents, which have legal requirements associated with them, which have corporate branding issues associated with them, and which can reasonably change without requiring a vote of the corporate board of directors.

The Look of the Document

While one of the problems you encounter is older fonts, another potential problem area involves the graphics that populate your business documents. You may have corporate logos and product branding icons associated with processes and functions, or even signatures or text blocks that could not be rendered using a real font. Each of the possibilities has its own challenges.

The file formats, including fonts, graphics, electronic overlays/forms, and other resources used to create your print, are only a part of the picture. They are the biggest part, but not the only part. There is also the question of the real estate available to present your data, how it is oriented (tall/portrait or wide/landscape), and how you want to handle the navigation of the document in its new form. These are big questions because a screen of any size is not a piece of paper.

Once you understand what
you have, you will move on to the process of determining the best tools for
making your data available in alternative environments. There are as many
approaches as there are enterprises around the world. Some of you will need
a one-time conversion of all resources and programs to accommodate
on-the-fly publishing to paper or screen, while others will favor an
approach that reconstitutes the data for the target alternative output
devices using batch transform programs. Still others will find combinations
of batch and WYSIWYG (What You See Is What You Get) products, including
print drivers, batch transform products and import/export schemes through
document creation tools, that create the best environment.

What Can You Re-Purpose?

For most companies this legacy data is the foundation of the corporate information database, which means that it has to be handled as carefully as possible. The starting point for working with it should be the identification of the applications that produce output, whether that output is printed directly from the application or passed to other programs for enhancement, re-purposing, printing, or storage.

Can any application be re-purposed for the web or some other alternative device? While you will always find situations where the cost to re-purpose the data would not be worth it, the answer is that any application can have its output re-purposed. Some applications will be easier than others, and some just shouldn’t be moved. The only real requirements are that, at some point, you need access to all of the input, all of the output streams, and a way to describe and reformat the output for the new target delivery environments.

While we will talk more
about moving legacy data to the web or web-enabled/web-friendly
environments, the procedures we walk through in the forthcoming pages will
work regardless of the actual target output environment, including cell
phones, pagers, and electronic ink devices.
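
One way to picture the starting point described above, identifying each application that produces output and checking the requirements for re-purposing it, is a simple inventory record. The sketch below is only an illustration: the class names, field names, and the sample application are hypothetical, and the format list is limited to formats named in this chapter.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class PrintFormat(Enum):
    """Output formats named in this chapter."""
    LINE_DATA = "line data"
    AFP = "IBM AFP"
    XEROX_DJDE = "Xerox DJDE/LCDS"
    XEROX_METACODE = "Xerox Metacode"
    PCL = "HP PCL"
    POSTSCRIPT = "Adobe PostScript"
    PDF = "Adobe PDF"


@dataclass
class ApplicationRecord:
    """One inventory entry for an application that produces output (hypothetical layout)."""
    name: str
    output_format: PrintFormat
    has_input_access: bool
    has_output_access: bool
    # Fonts, forms, graphics, and print environment files the job depends on.
    resources: List[str] = field(default_factory=list)
    # Other programs that receive the output for enhancement, printing, or storage.
    downstream: List[str] = field(default_factory=list)

    def missing_requirements(self) -> List[str]:
        """List the access requirements that are not yet met."""
        missing = []
        if not self.has_input_access:
            missing.append("input data")
        if not self.has_output_access:
            missing.append("output stream")
        return missing


# Hypothetical example entry.
statements = ApplicationRecord(
    name="monthly-statements",
    output_format=PrintFormat.XEROX_DJDE,
    has_input_access=True,
    has_output_access=False,
    resources=["logo graphic", "statement form"],
    downstream=["archive"],
)
print(statements.name, "is missing:", statements.missing_requirements())
```

A real inventory would also need to record how the output will be described and reformatted for each target environment, which this sketch leaves out.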
In this book, we are concentrating on the output of the applications and tools you use, not the tools and applications themselves. That would be a separate book. However, there are a few things you will want to know about the applications you use as you create your inventory lists and do your audits.

Composition tools are that class of applications that cover mainframe and PC-based products used to add formatting information to text blocks. Composition tools come in two basic flavors: tagged/batch and What-You-See-Is-What-You-Get (WYSIWYG).

Batch tools are most commonly found on the mainframe. The most popular batch composition tools remain IBM’s Document Composition Facility (DCF)/SCRIPT and its extension called BookMaster, and a product from Document Sciences, Inc. called CompuSet (formerly XICS). You might also find Waterloo Script, which has a lot of the same history as DCF, but was developed by the University of Waterloo in Canada, and so has some features that differ from IBM’s product. For example, it directly supported most Xerox printers early in its product cycle, while DCF requires third-party add-ons to produce output to Xerox printers.

Most of these products owe their existence to Dr. Charles Goldfarb, a researcher at IBM who defined the method for making content and formatting independent of each other. His original application was for legal documents, but the methodology he defined worked for business documents across the board. In all of these products, there are sets of tags or controls that authors or document formatting specialists add to the text to cause formatting (and sometimes other processing) to occur. The tagged text files are then composed and the result is a print file. If you still have these products in use, and many companies do, you will want to know what version and release you are using, what default fonts are in use with these products, and if you use the output of these composition tools with other application programs.

WYSIWYG tools are generally found on the PC. They may be purpose-built for a specific environment, such as a forms development tool, or they may be general-purpose word processing systems, such as WordPerfect or Microsoft Word. General-purpose PC tools often require some additional tool to produce output compatible with host-based applications or to produce output to the AFP and Xerox print environments, so look for critical items like print drivers or third-party utility programs if you know that you use PC-based tools for your development.

If you are using one of a more recent class of composition tools, such as Dialogue from Exstream or Opus from Elixir, they generally provide composition, resource management, and multi-purpose output. Applications using these types of tools should migrate to the web and other devices without difficulty, but always talk to the internal experts and the vendor about how your environment is configured.

... Continued in Wrestling Legacy Data to the Web & Beyond: Practical Solutions for Managers & Technicians from MC2 Books.