SEC file format

GypsyComet · Jan 13, 2009

dorward said:
tjoneslo said:

This is very important. The SEC file format is primarily a human-readable format. All of the other formats under discussion are primarily for machine reading, with the ability to be read by a human. Until the computer becomes an integral part of the tabletop role-playing experience, this requirement for a human-readable-first format for information won't change.

There's no need for the format to be designed to be used on paper at the tabletop. Nobody reads raw Word documents on paper (or at all for that matter), they use tools to transform it into something friendlier.

You miss my point. What data format the computer uses is irrelevant if the results cannot be interpreted at an actual game. Yes, some of us still hold those, and no computers are present. Input and output options for older formats and paper must take that into account or you've spent your efforts to support the armchair admirals and not the people actually rolling dice.

dorward · Jan 13, 2009

GypsyComet said:
You miss my point. What data format the computer uses is irrelevant if the results cannot be interpreted at an actual game.

You miss my point. The format the computer uses and the results used at the table do not have to be the same thing.

GypsyComet · Jan 13, 2009

dorward said:
GypsyComet said:

You miss my point. What data format the computer uses is irrelevant if the results cannot be interpreted at an actual game.

You miss my point. The format the computer uses and the results used at the table do not have to be the same thing.

No, that was rather my point to begin with. We're talking around the same conclusion, but from different perspectives, I think.

Build some large, complex and not-human-readable format. Include everything you think might be usefully predetermined for a system.

Just make sure it can both input and output traditional (or not so traditional) human-readable formats. Otherwise it's worthless in play.

hhawk · Jan 13, 2009

For this post and others, when I say CSV I am referring to CSV with a column header.

dorward said:
I'm still decidedly unsold on CSV.

You have immense amounts of horizontal scrolling, and can't extend the format with data on new lines (such as histories of planets).

Horizontal scroll can be easily mitigated. Buy a second monitor or sort the columns to an order that mitigates scrolling (optional columns appear last).

New data is supported by adding new columns with new keys. You can put an entire text file in a column entry; you just have to quote the field or escape any ',' inside it.
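A minimal sketch of that point, using Python's standard `csv` module. The column names (`Hex`, `Name`, `UWP`, `Notes`) are assumptions for illustration, not an agreed set of labels. The writer quotes any field containing a comma or a line break, and the reader undoes it:

```python
import csv
import io

# Hypothetical column labels; only the values come from SEC-style data.
FIELDS = ["Hex", "Name", "UWP", "Notes"]
rows = [
    {
        "Hex": "0102",
        "Name": "Reno",
        "UWP": "C1207B9-A",
        # Free text with a comma and a line break inside one field.
        "Notes": "Gambling is legal for offworlders, but offworlders must prepay entire\n"
                 "stay and have booked passage off to receive gambling card.",
    },
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=FIELDS)
writer.writeheader()
writer.writerows(rows)

# Reading the text back recovers the comma and the line break unchanged.
buf.seek(0)
loaded = list(csv.DictReader(buf))
```

The price is that a record can now span more than one physical line, so line-oriented tools such as grep no longer see one world per line.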

dorward said:
XML lets you mix namespaces, so the format could be defined to make use of XHTML modules for text (which would address that particular usecase).

And that will make it a royal pain to walk the XML parse tree. Everyone will organize their hierarchies differently, and it will be like the Tower of Babel.

dorward said:
I don't see CSV providing much protection against errors being introduced either. It just allows for different types of errors.

XML does not provide the error protection that you think. If there is an error in the XML how should a utility recover it? At least with CSV it is pretty easy to see that someone forgot a comma or two and either ignore those lines or require the user to fix them.
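As a rough illustration of the "forgot a comma" case, here is a sketch that compares each row's field count against the header and sets aside the rows that do not match. The function name `check_rows` and the three column labels are made up for this example:

```python
import csv
import io

def check_rows(text):
    """Return (good, bad): rows matching the header width as dicts, and
    (line number, raw fields) pairs for rows that do not match."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    good, bad = [], []
    for row in reader:
        if len(row) == len(header):
            good.append(dict(zip(header, row)))
        else:
            bad.append((reader.line_num, row))
    return good, bad

sample = (
    "Hex,Name,UWP\n"
    "0101,Zeycude,C330698-9\n"
    "0102 Reno,C1207B9-A\n"      # comma missing after the hex
    "0103,Errere,B263664-B\n"
)
good, bad = check_rows(sample)
```

A tool can then either skip the entries in `bad` or report their line numbers so the user can fix them.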

dorward said:
I don't think that limiting the format to make it easy to manipulate with existing tools is that wonderful an idea. It's nice in theory, but I think it requires too many sacrifices.

What sacrifices? Nothing is sacrificed relative to the SEC format, other than that the file is no longer in SEC format. Pretty-printing a CSV file should be trivial and easy to customize according to user specifications.

dorward said:
Better to produce libraries to build tools with, and a first round of tools IMO.

I already provided the basic libraries for CSV and SEC formats.

dorward said:
Deniable said:

XML is overkill and has its own set of problems.

It is a well understood format, with a lot of support for it.

CSV is a better understood format with more support for it. Simply put, CSV gets the job done with the least overhead and the smallest cost of entry for each application developer. Furthermore, a standard based on CSV is going to be much easier to comprehend than a standard based on XML.

dorward said:
Deniable said:

CSV will either have to have limited line length or will be as unreadable as XML.

If CSV gets limited line length, then it makes it inextensible, so it's unsuitable for this.

You can easily support unbounded line length by ordering the columns correctly. Put columns that have unbounded text or optional data last. Put the relevant data that every system has first. Pretty printers can then print a text file that excludes certain fields (i.e. projects just the data that you want to view).
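A small sketch of such a pretty printer, assuming a CSV file with a header row. The function name `project` and the column labels are hypothetical; the idea is just to keep the requested columns and pad them so they line up:

```python
import csv
import io

def project(text, columns):
    """Return aligned text lines containing only the requested columns."""
    rows = list(csv.DictReader(io.StringIO(text)))
    # Width of each column = longest value, or the label if that is longer.
    widths = {c: max([len(c)] + [len(r.get(c) or "") for r in rows]) for c in columns}
    lines = ["  ".join(c.ljust(widths[c]) for c in columns).rstrip()]
    for r in rows:
        cells = ((r.get(c) or "").ljust(widths[c]) for c in columns)
        lines.append("  ".join(cells).rstrip())
    return lines

sample = (
    "Hex,Name,UWP,Notes\n"
    "0101,Zeycude,C330698-9,\n"
    '0103,Errere,B263664-B,"Zhodani vacation world."\n'
)
for line in project(sample, ["Hex", "Name", "UWP"]):
    print(line)
```

Because the long `Notes` column sits last and is simply left out of the projection, the printed table stays narrow no matter how much text that column holds.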

hhawk · Jan 13, 2009

Joshua Bell said:
Excel importing and exporting XML UWP data
http://www.travellerrpg.com/CotI/Discuss/showthread.php?t=11691
Highlights:
* DTD for MWM's T5 data (which is mastered in an Excel spreadsheet)
* Advocacy for classic UWPs-as-raw-text rather than broken down to excruciating detail in XML format

(IMHO, as long as we're talking about the same underlying data, the particular serialization is not critical - it's only a transform away.)

Well, if the data is mastered in an Excel spreadsheet, CSV with a header is the way to go. With CSV the only things that have to be defined are a set of column labels and, for each label, the format and meaning of the data contained within. Different UWP versions can be handled with different labels (e.g. UWP, UWPv2), so files can have backward-compatible columns automatically generated by a tool to maintain the usefulness of older tools that are not being regularly updated.

hhawk · Jan 13, 2009

Deniable said:
The big thing about SEC is that it is a semi-formalized form of the sector data seen in CT/MT and later books. It's easy to interpret by eye and is supported by a lot of tools. Unless someone builds a better set of tools that use a new format, I don't see it getting supplanted.

This is basically a flaw of the SEC format: different files put the spacing in different places. To solve this problem all you have to do is figure out the boundaries for each column and provide them to the loader. It was my impression that many people want to build better tools, but they want to agree on a common file format that is easy to read and write.
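Here is a sketch of a loader that takes the column boundaries as a parameter. The offsets in `SEC_COLUMNS` are an assumption for one particular SEC layout (0-based, end-exclusive); a file with different spacing would simply get a different table:

```python
# Assumed boundaries for one SEC variant; not a universal definition.
SEC_COLUMNS = {
    "name": (0, 14), "hex": (14, 18), "uwp": (19, 28), "base": (30, 31),
    "codes": (32, 47), "zone": (48, 49), "pbg": (51, 54),
    "allegiance": (55, 57), "stellar": (58, None),
}

def load_fixed(line, columns):
    """Slice one fixed-width record into named, whitespace-trimmed fields."""
    return {field: line[start:end].strip() for field, (start, end) in columns.items()}

#       0         1         2         3         4         5         6
#       0123456789012345678901234567890123456789012345678901234567890123456
LINE = "Errere        0103 B263664-B  Z C0 Ni Ri           910 Zh M1 V M4 D"

world = load_fixed(LINE, SEC_COLUMNS)
```

Supporting another variant then means writing a new boundary table, not a new loader.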

kristof65 · Jan 13, 2009

Bottom line to me is that any new format should be as easily used as possible to a newcomer to Traveller with the software tools they already have. Someone completely new to Traveller should be able to download a sector file and open it up in their word processor and be able to make sense of the data (when interpreted with the Traveller rules, of course.)

SEC (and to some extent, CSV) do that, although not in the friendliest manner - i.e., if Windows balks at opening an SEC file, not everyone knows that it's just basically plain text. Frustrations could ensue simply because someone doesn't know computers very well and the file format gets blamed, when it isn't the cause.

Supposedly XML when used with more recent word processors and office suites handles that problem. I'm not an XML expert though, so maybe what I've heard is wrong or I'm listening to the hype. Perhaps there is another standard - say one for Spreadsheets - that would be more applicable?

All in all, it's my opinion that no "committee" is going to be able to decide this. Basically, someone is going to create a tool and file format for it that just happens to gain wide acceptance. The tool and format that is the most flexible and easiest to use is probably going to be the winner.

Gentlemen - start your (programming) engines...

hhawk · Jan 13, 2009

kristof65 said:
Bottom line to me is that any new format should be as easily used as possible to a newcomer to Traveller with the software tools they already have. Someone completely new to Traveller should be able to download a sector file and open it up in their word processor and be able to make sense of the data (when interpreted with the Traveller rules, of course.)

Where do these newcomers get these tools and sector files? I am still looking for a good source of SEC files. If someone could provide me with information on where to get most SEC files, I could probably make a viewing utility that uses CSV.

kristof65 said:
SEC (and to some extent, CSV) do that, although not in the friendliest manner - i.e., if Windows balks at opening an SEC file, not everyone knows that it's just basically plain text. Frustrations could ensue simply because someone doesn't know computers very well and the file format gets blamed, when it isn't the cause.

Supposedly XML when used with more recent word processors and office suites handles that problem. I'm not an XML expert though, so maybe what I've heard is wrong or I'm listening to the hype. Perhaps there is another standard - say one for Spreadsheets - that would be more applicable?

One standard for data transfer with spreadsheets is CSV ;-p

kristof65 said:
All in all, it's my opinion that no "committee" is going to be able to decide this. Basically, someone is going to create a tool and file format for it that just happens to gain wide acceptance. The tool and format that is the most flexible and easiest to use is probably going to be the winner.

Gentlemen - start your (programming) engines...

I agree that no committee can decide. However, feedback from a committee can be valuable. Another important feature of the tool is that its source needs to be separate from the data files. This way it can be distributed and used.

kristof65 · Jan 13, 2009

hhawk said:
Where do these newcomers get these tools and sector files? I am still looking for a good source of SEC files. If someone could provide me with information on where to get most SEC files, I could probably make a viewing utility that uses CSV.

It's been a while since I've looked and/or seen any. I'm sure they're still out there, though.

One standard for data transfer with spreadsheets is CSV ;-p

Don't get me started on CSV files and computer neophytes. I do tech support for a piece of equipment that can download reports as raw CSV files, and half my customers don't have a frickin clue how to deal with a CSV file, despite its simplicity. If they can't click on it, have it open in the appropriate program, and immediately know what they are looking at, it is either useless to them, a tech support issue I have to deal with, or both.

I'd like to see the same issue avoided here, although I will admit the average Traveller player is probably smarter than the types I give tech support to.

simonh · Jan 13, 2009

Joshua Bell said:
Excel importing and exporting XML UWP data
http://www.travellerrpg.com/CotI/Discuss/showthread.php?t=11691
Highlights:
* DTD for MWM's T5 data (which is mastered in an Excel spreadsheet)
* Advocacy for classic UWPs-as-raw-text rather than broken down to excruciating detail in XML format

(IMHO, as long as we're talking about the same underlying data, the particular serialization is not critical - it's only a transform away.)

Have you read the thread? After 5 pages of discussion the guy still wasn't able to open the file and read the data. I tried as well and I couldn't either.

One thing I noticed is that some extraneous forum markup had got mixed into the XML somehow, and I tried fixing it up manually but couldn't get it to parse cleanly as well-formed XML. That's one of the problems with the format: once it gets corrupted it's very hard to fix. Occasionally I've been able to extract partial data sets from corrupt XML files, but rarely.

Simon Hibbs

CaptnBrazil · Jan 13, 2009

There are regular expressions to parse SEC files as long as they are in the basic order expected. I think the traveller map site does this, and I know I do (so far it's been tested on 3 variants of the SEC files and gets the data out correctly regardless of fields not all being the same width).

If any file gets garbled (XML, the flat SEC files, binary) the parsers usually can't do anything with them.

The only way an updated SEC file would take hold is for one of the license holders to actually update it & release software for it. I don't see that happening as (1) commercial software development is not cheap, (2) the market is not large enough to release anything at a viable price for most gamers, and (3) there are a LOT of SEC files floating around, and while you could extend the end of them & probably not affect existing software, you could not change anything in the middle.

As for XML, there are, as always, good and bad points. I like & use it simply because I can keep adding additional detail that has no effect on the software I've written. It simply ignores the new stuff. So for instance, my world XML may have something like:

Code:

<system>
  <main name="some system">
    <starport>A
      <quality>A</quality>
      <size>4</size>
      <hi>3</hi>
    </starport>
    <UPP>123456</UPP>
    <tech>8
      <max>A</max>
      <avg>8</avg>
    </tech>
  </main>
</system>

The starport modifiers got added later, when I wanted to add additional info about starports. It was not in the original iteration, and the existing code just ignores the new stuff & works fine, and the new stuff can use it.
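To show the "old code just ignores the new stuff" behaviour, here is a sketch in Python using the standard `xml.etree.ElementTree` module. It reads a well-formed variant of the snippet above, with the system name held in a hypothetical `name` attribute. The reader only asks for the elements it knows about, so the added `<quality>` and `<size>` children do not disturb it:

```python
import xml.etree.ElementTree as ET

DOC = """
<system>
  <main name="some system">
    <starport>A
      <quality>A</quality>
      <size>4</size>
    </starport>
    <UPP>123456</UPP>
  </main>
</system>
"""

def read_world(xml_text):
    """An 'original iteration' reader: it knows nothing about the
    starport modifiers and never looks at them."""
    main = ET.fromstring(xml_text).find("main")
    starport = main.find("starport")
    return {
        "name": main.get("name"),
        # .text holds only the text that precedes the first child element.
        "starport": (starport.text or "").strip(),
        "upp": main.findtext("UPP"),
    }

world = read_world(DOC)
```

A newer tool can call `starport.findtext("quality")` on the same file, and both versions keep working side by side.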

However - just remember that there are more software wars on languages than Traveller canon wars! :roll:

CaptnBrazil · Jan 13, 2009

BTW, the regex for SEC files:

Code:

        private static readonly Regex worldRegex = new Regex( @"^" +
            @"( \s*             (?<name>        [^\s.](.*?[^\+\s.])?  ) )? \+?\.* " +    // Name
            @"( \s*             (?<hex>         \d\d\d\d              ) )      " +    // Hex
            @"( \s+             (?<uwp>         \w{7}-\w              ) )      " +    // UWP (Universal World Profile)
            @"( \s+             (?<base>        \w | \*               ) )?     " +    // Base
            @"( \s{1,3}         (?<codes>       .{10,}?               ) )      " +    // Codes
            @"( \s+             (?<zone>        \w                    ) )?     " +    // Zone
            @"( \s+             (?<pbg>         \d[0-9A-F][0-9A-F]    ) )      " +    // PBG (Population multiplier, Belts, Gas giants)
            @"( \s+  (\w\w\/)?  (?<allegiance>  (\w\w\b|\w-|--)       ) )      " +    // Allegiance
            @"( \s*             (?<stellar>     .*                    ) )      "        // Stellar data (etc)
            , RegexOptions.Compiled | RegexOptions.CultureInvariant | RegexOptions.Singleline | RegexOptions.ExplicitCapture | RegexOptions.IgnorePatternWhitespace );

(C# - someone sent that to me, and I use C#. You can use this to both validate a record & extract out the data via the (?<name> stuff)
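For anyone not on .NET, here is a rough Python port of the same pattern, offered as a sketch and not as a tested replacement. The changes are mechanical: `(?<name>...)` becomes `(?P<name>...)`, `IgnorePatternWhitespace` becomes `re.VERBOSE`, and `Singleline` becomes `re.DOTALL`. The helper name `parse_world` is made up here.

```python
import re

WORLD_RE = re.compile(r"""
    ^
    ( \s*           (?P<name>        [^\s.](.*?[^\+\s.])?  ) )? \+?\.*   # Name
    ( \s*           (?P<hex>         \d\d\d\d              ) )           # Hex
    ( \s+           (?P<uwp>         \w{7}-\w              ) )           # UWP
    ( \s+           (?P<base>        \w | \*               ) )?          # Base
    ( \s{1,3}       (?P<codes>       .{10,}?               ) )           # Codes
    ( \s+           (?P<zone>        \w                    ) )?          # Zone
    ( \s+           (?P<pbg>         \d[0-9A-F][0-9A-F]    ) )           # PBG
    ( \s+ (\w\w/)?  (?P<allegiance>  (\w\w\b|\w-|--)       ) )           # Allegiance
    ( \s*           (?P<stellar>     .*                    ) )           # Stellar data
""", re.VERBOSE | re.DOTALL)

def parse_world(line):
    """Return the named fields of one SEC record, or None if it does not match."""
    m = WORLD_RE.match(line)
    if m is None:
        return None
    # Optional fields that did not take part in the match stay None.
    return {k: (v.strip() if v is not None else None) for k, v in m.groupdict().items()}

world = parse_world("Zeycude            0101 C330698-9   Na Ni Po De       613 Zh M9 V")
```

As with the C# version, a `None` result doubles as validation: the record is either fully understood or rejected.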

Joshua Bell · Jan 14, 2009

CaptnBrazil said:
The starport modifiers got added later, when I wanted to add additional info about starports. It was not in the original iteration, and the existing code just ignores the new stuff & works fine, and the new stuff can use it.

If we're restricting the data to *tabular* structures, then SEC or CSV or XML doesn't really matter. CSV can be extensible... if you limit the extensibility to additional columns.

What XML or JSON (which I'm partial to) offer is a richer data structure (trees, annotations, etc) and extensibility. XML perhaps goes too far, JSON perhaps not far enough (there's no standard for namespaces in JSON, for example), but whatever. (And, of course, you *can* store any data structure in multiple tables, it just gets icky to peruse. Semi-human-readable formats are nice.)
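As a small illustration of the richer structure, here is one world as a JSON tree, round-tripped with Python's standard `json` module. The key names are assumptions for the example, not a proposed schema:

```python
import json

# Illustrative keys only; nothing here is an agreed schema.
world = {
    "hex": "0102",
    "name": "Reno",
    "uwp": "C1207B9-A",
    # A list of notes and a list of stars do not fit one flat table row.
    "notes": [
        "Gambling is legal for offworlders, but offworlders must prepay entire "
        "stay and have booked passage off to receive gambling card.",
    ],
    "stars": [{"type": "G8 V"}, {"type": "M1 D"}],
}

text = json.dumps(world, indent=2)
restored = json.loads(text)
```

The same data in tables would need a worlds table plus separate notes and stars tables keyed by hex, which is the "icky to peruse" case.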

I'm more interested in a common format for exchanging non-tabular Traveller data, since there *is* no standard for it. SEC is easy. Data for individual systems (down to the planetary map level) or metadata about sectors or collections of sectors or even whole milieux is what's important to me.

AKAramis · Jan 14, 2009

Joshua Bell said:
CaptnBrazil said:

The starport modifiers got added later, when I wanted to add additional info about starports. It was not in the original iteration, and the existing code just ignores the new stuff & works fine, and the new stuff can use it.

If we're restricting the data to *tabular* structures, then SEC or CSV or XML doesn't really matter. CSV can be extensible... if you limit the extensibility to additional columns.

Actually, it's not that hard to do a human readable ascii flatfile format that is both extensible and flexible.

The trick is in establishing the 1st character of a line as a delimiter of record type.

For example, using a 7 character window per line for "header"

H_LLLL_ where H is the record type and LLLL is the location.
If LLLL is ----, the rest of the line marks out the field positions for that record type; if LLLL is ---N, N is the field marker used in that layout line and the rest of the line is the name of the field. Otherwise, LLLL is the hex number.

For example, we'll encode the structure from the Wiki

Code:

M ---- 11111111111111HHHH 222222222  3  444444444444444 5  678 99 AAAAAAAAAAAAAAAA
M ---  whitespace
M ---1 Name
M ---2 UWP
M ---3 Bases
M ---4 Trade Codes & Comments
M ---5 Zone
M ---6 PopMult
M ---7 Belts
M ---8 GG 
M ---9 Allegiance
M ---A Stellar Data
M ---H hex number
# ---- 1111111111111111111111111111111111111111111111111111111111111111111111
N ---1 note
S ---- 111111111111111 2222 33333333
S ---1 Sector Name
S ---2 Capital Hex
S ---3 reference date
T ---- 1  222222222222222 3333
T ---1 Subsector ID
T ---2 Subsector name
T ---3 Capital Hex
S ---- Spinward Marches
# 0000 The Spinward Marches data was ftped from Sunbane.
# 0000 Similar data seems to have been published in Supplement 3,
# 0000 The Spinward Marches, put out by Game Designers' Workshop
# 0000 in 1979. The Spinward Marches gives no credits, so it's
# 0000 difficult to say who precisely was involved in its creation.
T 0000 A: Chronor
T 0000 B: Jewell
T 0000 C: Regina
T 0000 D: Aramis
T 0000 E: Querion
T 0000 F: Vilis
T 0000 G: Lanth
T 0000 H: Rhylanor
T 0000 I: Darrian
T 0000 J: Sword Worlds
T 0000 K: Lunion
T 0000 L: Mora
T 0000 M: Five Sisters
T 0000 N: District 268
T 0000 O: Glisten
T 0000 P: Trin's Veil
M 0101 Zeycude            0101 C330698-9   Na Ni Po De       613 Zh M9 V
M 0102 Reno               0102 C1207B9-A   Na Po De          603 Zh G8 V M1 D
# 0102 Gambling is legal for offworlders, but offworlders must prepay entire 
# 0102 stay and have booked passage off to receive gambling card.
M 0103 Errere             0103 B263664-B Z C0 Ni Ri          910 Zh M1 V M4 D
# 0103 Zhodani vacation world.
M 0104 Cantrel            0104 C366243-9   Lo Ni             520 Zh F1 III
M 0108 Gyomar             0108 D8B2889-5   Fl                824 Na A8 IV
M 0111 Atson              0111 B310598-8   Ni                933 Na K8 VI
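A rough sketch of how a reader for this layout might work, assuming the 7-character prefix described above (record type in column 1, location in columns 3-6). The function name `parse_records` is hypothetical. Lines whose location starts with `---` are treated as layout/key lines; everything else is grouped by hex so that notes end up next to their world:

```python
from collections import defaultdict

def parse_records(lines):
    """Return (schema, data): key lines per record type, and
    (record type, payload) pairs grouped by location."""
    schema = defaultdict(list)   # record type -> [(location, text), ...]
    data = defaultdict(list)     # hex -> [(record type, payload), ...]
    for line in lines:
        if not line.strip():
            continue
        rtype, location, payload = line[0], line[2:6], line[7:].rstrip()
        if location.startswith("---"):
            schema[rtype].append((location, payload))
        else:
            data[location].append((rtype, payload))
    return schema, data

SAMPLE = """\
M ---1 Name
M ---2 UWP
M 0102 Reno               0102 C1207B9-A   Na Po De          603 Zh G8 V M1 D
# 0102 Gambling is legal for offworlders, but offworlders must prepay entire
# 0102 stay and have booked passage off to receive gambling card.
M 0103 Errere             0103 B263664-B Z C0 Ni Ri          910 Zh M1 V M4 D
# 0103 Zhodani vacation world.
""".splitlines()

schema, data = parse_records(SAMPLE)
notes_0102 = [payload for rtype, payload in data["0102"] if rtype == "#"]
```

A tool that only understands `M` records can skip every other record type, which is where the extensibility comes from.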

CaptnBrazil · Jan 14, 2009

Then you have to have the key to read it. The reality of it is that there are as many ways to approach this as there are programmers - and everyone will think that theirs is the *best* way (when in reality, hyperbole is the best!)

I deal with dozens of file formats, and then have to translate them to our core system. There really is no best per se as they all carry the data that they need to. We even get some files in a mixed pseudo-hex/binary/ASCII format (each record has some of each).

CSV can be added to forever and is easy to parse. You just have to know what the columns are. Flat, fixed-length records, same thing. XML & JSON, at least to me, have the advantage of metadata. But then they have to carry some excess data for that metadata, so the files are necessarily larger.

In all cases, somewhere you have to write the parser to extract the info from the file and store it back. And if you are going to do that, the actual file format really makes no difference, as you will be reading over the specs & writing the parser around them anyway.

So I maintain that pretty much people will write their own file version - I just don't see the Traveller market as large enough for a company to come up with anything. I know that Universe 2 is being written, but it uses MS SQL to hold the data (although I imagine it can read SEC files in) according to what I've read.

Deniable · Jan 14, 2009

dorward said:
Deniable said:

CSV will either have to have limited line length or will be as unreadable as XML.

If CSV gets limited line length, then it makes it inextensible, so it's unsuitable for this.

Well, if we're careful, CSV can be extensible without a lot of trouble. Add to this the commonality of split() in various languages and libraries and it may be a winner.

Deniable said:
Deniable said:

and is supported by a lot of tools. Unless someone builds a better set of tools that use a new format, I don't see it getting supplanted.

Building better tools is the reason I'm interested in coming up with a new format.

Well, build it and they will come, otherwise we'll talk about it forever.

Deniable · Jan 14, 2009

hhawk said:
Where do these newcomers get these tools and sector files? I am still looking for a good source of SEC files. If someone could provide me with information on where to get most SEC files, I could probably make a viewing utility that uses CSV.

Deniable said:
The best source I've found for SEC files is The Missouri Archive and specifically this directory.

Berka also has a good collection of data.

AKAramis · Jan 14, 2009

CaptnBrazil said:
Then you have to have the key to read it. The reality of it is that there are as many ways to approach this as there are programmers - and everyone will think that theirs is the *best* way (when in reality, hyperbole is the best!)

The trick is to have the key be part of the document. And to have it be self-evident to the human reader, as part of the header. Note that, in actual practice, I'd move all the hex numbers to the left, and delete them from after the system name.

ASCII/UTF-8 flatfiles have the advantage of being widely platform agnostic, small, and readily transmitted.

.sec has survived because it is just so portable and serviceable.

tjoneslo · Jan 14, 2009

hhawk said:
Where do these newcomers get these tools and sector files? I am still looking for a good source of SEC files. If someone could provide me with information on where to get most SEC files, I could probably make a viewing utility that uses CSV.

The traveller wiki has a collection of 97 SEC files. Enjoy.

CaptnBrazil · Jan 14, 2009

AKAramis said:
The trick is to have the key be part of the document. And to have it be self-evident to the human reader, as part of the header. Note that, in actual practice, I'd move all the hex numbers to the left, and delete them from after the system name.

ASCII/UTF-8 flatfiles have the advantage of being widely platform agnostic, small, and readily transmitted.

.sec has survived because it is just so portable and serviceable.

I agree - and I like the format you laid out. I'd probably not use it, but it is easy to read, easy to parse, and expandable. I'm thinking I like the idea of XML simply because in theory you can do some interesting search & collation on it via existing XPath parsers, whereas for a straight text file you'd have to do it by reading the entire file (well, not manually, but parsing line by line). Or, well, hey - if you have command-line utilities such as grep I suppose you could roll your own search feature. I know I've written a Windows grep utility to search for specific things (a very, very limited grep!).
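A quick sketch of the kind of search meant here, using the XPath subset built into Python's `xml.etree.ElementTree`. The `<sector>`/`<world>` element layout is invented for the example; only the world names and starport classes come from the data earlier in the thread:

```python
import xml.etree.ElementTree as ET

# Hypothetical element layout, chosen only to make the query readable.
DOC = """
<sector name="Spinward Marches">
  <world hex="0101"><name>Zeycude</name><starport>C</starport></world>
  <world hex="0103"><name>Errere</name><starport>B</starport></world>
  <world hex="0111"><name>Atson</name><starport>B</starport></world>
</sector>
"""

root = ET.fromstring(DOC)

# [child='text'] keeps only elements whose child has exactly that text.
class_b = [w.findtext("name") for w in root.findall("./world[starport='B']")]
```

With a flat text file the equivalent is a line-by-line scan (or grep) that knows which columns hold the starport code.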

Now if that format took off, I'd probably implement it as well. Assuming I ever actually finish any of the Traveller programs I write! But as I develop my Traveller class for stuff (primarily trade, but it has this feature creep going on...) I can add additional backend parsers as required. Front end does not care what format the data is. One of the reasons I'm so pro-object oriented now under the right conditions: create the proper class and you can entirely separate the file I/O from the UI. Change file formats, it has no effect on the actual program, just an updated backend class.
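A minimal sketch of that separation, in Python for brevity. All names here (`SectorBackend`, `CsvBackend`, `SpaceSeparatedBackend`, `list_worlds`) are made up for illustration, and the second backend reads a toy "hex name uwp" format standing in for a real SEC parser:

```python
import csv
import io
from typing import Protocol

class SectorBackend(Protocol):
    """Anything that can turn file text into a list of world dicts."""
    def load(self, text: str) -> list[dict]: ...

class CsvBackend:
    def load(self, text):
        return [dict(row) for row in csv.DictReader(io.StringIO(text))]

class SpaceSeparatedBackend:
    """Toy 'hex name uwp' format, standing in for a real SEC reader."""
    def load(self, text):
        worlds = []
        for line in text.splitlines():
            if line.strip():
                hex_, name, uwp = line.split()
                worlds.append({"Hex": hex_, "Name": name, "UWP": uwp})
        return worlds

def list_worlds(backend: SectorBackend, text: str) -> list[str]:
    """Front end: formats worlds without knowing the file format."""
    return [f'{w["Hex"]} {w["Name"]} ({w["UWP"]})' for w in backend.load(text)]

from_csv = list_worlds(CsvBackend(), "Hex,Name,UWP\n0101,Zeycude,C330698-9\n")
from_flat = list_worlds(SpaceSeparatedBackend(), "0101 Zeycude C330698-9\n")
```

Adding another file format means adding one more class with a `load` method; `list_worlds` and anything else on the front end stays untouched.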
