Your Universal Remote Control Center
RemoteCentral.com
Philips Pronto Classic Forum - View Post
The following page was printed from RemoteCentral.com:

Page 1 of 2
Topic: Huge IR Code Database
This thread has 18 replies. Displaying posts 1 through 15.
Post 1 made on Wednesday September 15, 2004 at 08:57
Dave Houston
RF Expert
Joined: October 2001
Posts: 1,521
I came across this code database at [Link: intellanet.com].

According to Intellanet, the RAW codes are CCF hex. They do resemble CCF but ALL appear to be invalid. Pronto Edit cannot digest them. Even if you fix words 3 & 4, the resulting codes seldom appear to be accurate: they have either too many or too few bits and invalid patterns of bytes, complementary bytes, etc.

The GC-100 format codes also appear to be 100% invalid.

Maybe someone else can see something that I'm missing. It sure would be nice if such a large collection of codes could be utilized.
Post 2 made on Wednesday September 15, 2004 at 10:18
johnsfine
IR Expert
Joined: September 2002
Posts: 5,159
At first glance it appears to be pretty bad. Some of it may be good enough to get some useful information from in cases where no decent source is available.

If someone has some question about IR for a specific brand that is there but not elsewhere, I'll take a closer look.

For my initial look, I just focused on a Toshiba entry, because those are easier than most (I picked the CT 90157).

In that Toshiba sample, if you look at the Rawcode format, you'll see it's mostly right (after you make the obvious fix to the third and fourth words).
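As a rough illustration of the "obvious fix", here is a minimal Python sketch. It assumes the usual learned Pronto hex layout: word 1 is 0000, word 2 is the frequency code, and words 3 and 4 are the burst-pair counts of the one-time and repeat sequences. The function names and the sample string are made up for illustration, and assigning every pair to the one-time sequence is only one possible repair.

```python
def parse_pronto(hex_text):
    """Split a Pronto hex string into a list of integer words."""
    return [int(word, 16) for word in hex_text.split()]


def counts_are_consistent(words):
    """True if words 3 and 4 (1-based) match the number of burst pairs present."""
    if len(words) < 4 or (len(words) - 4) % 2 != 0:
        return False
    once_pairs, repeat_pairs = words[2], words[3]
    return 2 * (once_pairs + repeat_pairs) == len(words) - 4


def fix_counts_as_once(words):
    """Assumed repair: treat every burst pair as part of the one-time sequence."""
    body = words[4:]
    if len(body) % 2 != 0:
        raise ValueError("odd number of duration words; cannot form burst pairs")
    return words[:2] + [len(body) // 2, 0] + body


# Made-up sample: the header claims 5 + 9 pairs but only 3 pairs follow.
sample = "0000 006D 0005 0009 0156 00AB 0015 0015 0015 0040"
words = parse_pronto(sample)
fixed = fix_counts_as_once(words)
print(counts_are_consistent(words), counts_are_consistent(fixed))
```

A check like this only confirms that the header agrees with the length of the data. It says nothing about whether the durations themselves describe a real signal.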

I see no pattern to the crap in the third and fourth words. Clearly they represent something that has little, if any, connection to what they represent in Pronto Hex. Also clearly they represent something that is so badly learned that it has little, if any, connection to the real signal.

In that sample, notice the Power signal is a good NEC sample without the repeat part and notice the '8' is a good NEC1 sample with the repeat part.
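To make "a good NEC sample" concrete, the sketch below decodes the data bits of one NEC frame and checks the two properties discussed in this thread: exactly 32 data bits, and a fourth byte that is the complement of the third. It assumes the input is the main frame only (lead-in, data bits, trailing burst), given as (on, off) pairs in carrier cycles. The helper names and the sample values are invented for illustration and are not taken from the database.

```python
def decode_nec_bits(pairs):
    """Decode NEC data bits from (on, off) burst pairs given in carrier cycles.

    pairs[0] is assumed to be the lead-in and the final pair the trailing
    burst plus gap. A space longer than twice its burst counts as a 1.
    """
    return [1 if off > 2 * on else 0 for on, off in pairs[1:-1]]


def bits_to_bytes(bits):
    """Pack bits into bytes, least significant bit first, as NEC sends them."""
    return [
        sum(bit << i for i, bit in enumerate(bits[start:start + 8]))
        for start in range(0, len(bits) - len(bits) % 8, 8)
    ]


def looks_like_nec(pairs):
    """Check for 32 data bits with the 4th byte the complement of the 3rd."""
    bits = decode_nec_bits(pairs)
    if len(bits) != 32:
        return False
    data = bits_to_bytes(bits)
    return data[2] ^ data[3] == 0xFF


# Made-up frame: lead-in, four bytes sent LSB first, trailing burst and gap.
ZERO, ONE = (21, 21), (21, 64)
sample_bytes = [0x40, 0xBF, 0x12, 0xED]
sample = [(342, 171)]
for value in sample_bytes:
    sample += [ONE if (value >> i) & 1 else ZERO for i in range(8)]
sample.append((21, 1500))

print(looks_like_nec(sample))
```

A frame with one extra data bit fails the length check, which is the kind of 33-bit problem described elsewhere in this thread.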

Then switch to "Intellanet" format and notice all signals have the exact same structure. Especially note how similar the Power and '8' are.

That tends to rule out the guess that the Rawcode signals were regenerated (by buggy software) from some partially recognised form. Rather, one might deduce that the Rawcode is quite close to the internal form and the Intellanet form is generated from a crude pattern matching process on the raw. Unless that theory is wrecked by some other sample (where the Intellanet form contains data that couldn't be extracted from the raw), I'd suggest ignoring the Intellanet if you try to actually use any of the data.

For understanding the Intellanet (to find the above counter example): I haven't looked to see if that format is documented anywhere (have you?) but in that Toshiba sample, it's pretty clear the 2001 is some ID for the pattern that was matched. The 5th and 6th words are each the bit reverses of the meaningful half of what they'd be in Pronto 900A format. The 2nd and the 7th through 14th words are all obvious. That leaves only the 3rd, 4th and 15th in question.
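For readers unfamiliar with the term, a "bit reverse" just mirrors the bit order of a value. The sketch below assumes the "meaningful half" is one byte (8 bits); the thread does not state the width, so that is an assumption, and the function name is made up.

```python
def reverse_bits(value, width=8):
    """Return value with its lowest `width` bits in reverse order."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


# 0x12 is 00010010 in binary; mirrored it becomes 01001000, which is 0x48.
print(f"{reverse_bits(0x12):02X}")
```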
OP | Post 3 made on Wednesday September 15, 2004 at 10:53
Dave Houston
RF Expert
Joined: October 2001
Posts: 1,521
I've gotten their attention and am exchanging email with their technical folk.

I have to sample more codes but it looks like their GC-100 codes may be OK after all. At least, the few that I've looked at this morning are OK. I think there were some extraneous spaces and text in the copies I pasted into CodeGenPro™.

I'll try to compare some where I know the CCF code is still messed up after fixing the word counts. (Some of their NEC codes have an extra bit.)

If the root codes they used are valid, then it's just a matter of fixing their conversion routines. But I worry that these were harvested and converted by a bot which means the provenance is unknown and therefore suspect.
Post 4 made on Wednesday September 15, 2004 at 11:33
johnsfine
IR Expert
Joined: September 2002
Posts: 5,159
On 09/15/04 14:53 ET, Dave Houston said...
I have to sample more codes but it looks like
their GC-100 codes may be OK after all.

I found a description of GC-100 format in a PDF file. But it doesn't quite fit those Toshiba samples. Can you explain:

That power code begins
sendir,1:1,1,38000,40,0,340,171

I understand the "sendir,1:1,1" does not relate to the structure of the IR signal.

The 38000 and then everything from the 340 on are obvious and match the PDF description.

But the 40,0 doesn't. The PDF said the 40 is the number of times the signal repeats, but this was an incomplete capture which cannot have any repeat. The 0 is the offset for the repeat, which they define in a weird way requiring it to be odd, which it isn't.
OP | Post 5 made on Wednesday September 15, 2004 at 12:09
Dave Houston
RF Expert
Joined: October 2001
Posts: 1,521
After looking at a few more codes, the GC-100 codes are suspect, also. It just happened that the first few I looked at were OK - most are not.

John, the 40,0 is incorrect. 40 should be the number of repeats which you have to predefine since there's no "button" on the GC-100. The next number should indicate where the repeat sequence starts. It's been a while since I read the GC-100 docs but I think it should be "1" if the entire code is repeated or some other number if only part of the code repeats. I don't recall if it was an index of pairs or of individual entries in the comma delimited list. The GC-100 documentation wasn't very clearly written. I believe it refers to the initial (i.e. one-time) sequence as a preamble.

The nature of the GC-100 makes repeats problematic. It returns a message over the network when it has finished sending a code. You need to wait for the message before sending another code. Lengthy repeats tie things up.

If Intellanet doesn't either fix or remove the codes, they will likely be buried by the support burden they'll cause.
Post 6 made on Wednesday September 15, 2004 at 12:51
johnsfine
IR Expert
Joined: September 2002
Posts: 5,159
On 09/15/04 16:09 ET, Dave Houston said...
The nature of the GC-100 makes repeats problematic.

That concept is a lot better than the similar aspect of Pronto Hex. It's amazing that Philips does nothing about the issue of repeats in a sequence, despite years of hearing customer complaints about it.

But, having never looked at GC-100 at all before, I have no idea whether their implementation shortchanges the potential of the concept.

It returns a message over the network when it
has finished sending a code. You need to wait
for the message before sending another code. Lengthy
repeats tie things up.

I don't have a good enough mental picture of how it all fits together to understand why the above is an issue. If a device needs a certain number of repeats then being able to specify that is better than kludging it some other way. If the IR emitter needs to be sending a long sequence then you'd expect that IR emitter to be tied up that long. If a configuration has IR emitters in multiple rooms, you'd like to be able to start one doing something before another has finished, but that may be more sophistication than you should expect from typical software.
OP | Post 7 made on Wednesday September 15, 2004 at 13:07
Dave Houston
RF Expert
Joined: October 2001
Posts: 1,521
On 09/15/04 15:33 ET, johnsfine said...

But the 40,0 doesn't. The PDF said the 40 is
the number of times the signal repeats, but this
was an incomplete capture which cannot have any
repeat. The 0 is the offset for the repeat, which
they define in a weird way requiring it to be odd,
which it isn't.

I dug out my copy of the GC-100 API and checked my notes.

The first item (40 in your example) is the number of times to repeat the repeat sequence (if any). (Actually, it is the number of copies to send.)

The second item is the index (in the list that follows these) of the START of the repeat sequence. If converting a CCF hex, it will equal (Word3*2)+1. The offset will always be odd.
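The conversion described above can be sketched in Python as follows. This is an illustration under stated assumptions, not Global Caché's or Intellanet's actual routine: the field order is taken from the sendir example quoted earlier in this thread, the on/off values are assumed to be carrier cycles as in Pronto hex, the frequency formula is the usual Pronto one, and falling back to offset 1 when there is no repeat sequence is my own guess. The function name, default connector and default ID are hypothetical.

```python
def pronto_to_sendir(hex_text, connector="1:1", command_id=1, count=1):
    """Build a GC-100 style sendir string from learned Pronto (CCF) hex."""
    words = [int(word, 16) for word in hex_text.split()]
    if len(words) < 6 or words[0] != 0:
        raise ValueError("expected learned Pronto hex starting with 0000")
    once_pairs, repeat_pairs = words[2], words[3]
    durations = words[4:]
    if len(durations) != 2 * (once_pairs + repeat_pairs):
        raise ValueError("words 3 and 4 do not match the number of burst pairs")
    # Pronto frequency word -> carrier frequency in Hz (0.241246 us per unit).
    frequency = round(1_000_000 / (words[1] * 0.241246))
    # Per the post above: offset = (Word3 * 2) + 1, so it is always odd.
    # Assumption: with no repeat sequence, repeat the whole code from index 1.
    offset = once_pairs * 2 + 1 if repeat_pairs else 1
    fields = ["sendir", connector, command_id, frequency, count, offset, *durations]
    return ",".join(str(field) for field in fields)


# Made-up sample: one lead-in pair sent once, one pair that repeats.
print(pronto_to_sendir("0000 006D 0001 0001 0156 00AB 0156 0055"))
```

The function refuses input whose word counts are inconsistent, since converting such a code would only carry the error into the new format.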

1,1 means send 1 copy of everything starting with index 1.

4,69 means send 1 copy of the first 68 items in the list and then 4 copies of the items starting at index 69.
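The two examples above can be checked with a small sketch that expands a count/offset pair into the full list of durations that would be sent. It follows the description in this post only; the function name and the 72-item dummy list are made up.

```python
def expand_sendir(count, offset, items):
    """Return the full list of durations implied by a count/offset pair.

    offset is the 1-based index of the first item of the repeat sequence.
    Items before it are sent once; the rest is sent `count` times.
    """
    if offset < 1 or offset % 2 == 0 or offset > len(items):
        raise ValueError("offset must be odd and inside the list")
    preamble = items[:offset - 1]
    repeat = items[offset - 1:]
    return preamble + repeat * count


items = list(range(1, 73))  # dummy list of 72 on/off values
print(len(expand_sendir(1, 1, items)))   # everything, one copy
print(len(expand_sendir(4, 69, items)))  # 68 items once, then 4 copies of the last 4
```

An offset of 0, as in the 40,0 example questioned earlier, is rejected here because it is neither odd nor a valid 1-based index.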
OP | Post 8 made on Wednesday September 15, 2004 at 13:14
Dave Houston
RF Expert
Joined: October 2001
Posts: 1,521
The problem comes in when you do not get the return message (perhaps because of network traffic) and send another code. If the first hasn't finished, it's truncated. You really have no way of knowing where the GC-100 might be in the sequence because your command to it may have been delayed.

I'd prefer a queue of codes and commands with a specific command to break in, if necessary.
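The GC-100 itself does not work this way, but the idea of a queue with a break-in command can be approximated on the controlling side. The sketch below is purely illustrative: the class, its method names, and the two callables passed in (one that sends a command, one that reports whether the completion message arrived in time) are all hypothetical and do not come from the GC-100 API.

```python
from collections import deque


class IrSendQueue:
    """Client-side queue: send one code, wait for its completion, then the next."""

    def __init__(self, send, wait_for_completion):
        # Both callables are hypothetical hooks supplied by the caller.
        self._send = send
        self._wait = wait_for_completion
        self._pending = deque()

    def enqueue(self, command):
        self._pending.append(command)

    def break_in(self, command):
        """Drop everything still queued and make this command the next one sent."""
        self._pending.clear()
        self._pending.append(command)

    def run(self):
        """Send queued commands in order; stop at the first missing completion."""
        completed = []
        while self._pending:
            command = self._pending.popleft()
            self._send(command)
            if not self._wait(command):
                # No completion message arrived: the unit's state is unknown,
                # so nothing further is sent that might truncate this code.
                break
            completed.append(command)
        return completed


# Fake transport for demonstration: "code-b" never reports completion.
log = []
ir_queue = IrSendQueue(send=log.append,
                       wait_for_completion=lambda command: command != "code-b")
for name in ("code-a", "code-b", "code-c"):
    ir_queue.enqueue(name)
completed = ir_queue.run()
print(completed, log)
```

Stopping at the first missing completion message is a deliberately conservative choice; a real controller would also need a timeout and some way to resynchronise.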
Post 9 made on Thursday September 16, 2004 at 03:19
antonewest
Long Time Member
Joined: January 2004
Posts: 24
I've used the following site quite a bit. I cut and paste these into neohacker for Junior and I've pasted codes directly into Pronto Edit. Codes seem to work fine for me. [Link: ir.premisesystems.com]
T
OP | Post 10 made on Thursday September 16, 2004 at 06:28
Dave Houston
RF Expert
Joined: October 2001
Posts: 1,521
I was aware of that Motorola database on the Premise site. I suspect it's the root of the one on the Intellanet site. The latter is a bit easier to navigate. It's a shame it's so screwed up.
Post 11 made on Thursday September 16, 2004 at 10:01
johnsfine
IR Expert
Joined: September 2002
Posts: 5,159
On 09/16/04 10:28 ET, Dave Houston said...
I was aware of that Motorola database on the Premise
site. I suspect it's the root of the one on the
Intellanet site.

I'm curious about what you saw as the clues leading to suspecting that. (Similarity in the set of models listed, or exact match in some learning errors, etc.)

While there are apparent translation errors in the Intellanet data, the examples I looked at also had plenty of errors that I'm convinced are learning errors rather than translation errors. The Premise site has remarkably few learning errors. Probably a lower rate of learning errors than any of the other major sources of IR signals I know of. That argues against the Intellanet site being some sort of translated copy of the Premise site.

The latter is a bit easier to
navigate. It's a shame it's so screwed up.

The only thing close to any navigation issue I see with the Premise site is they don't tell you device types, just model numbers. Usually you can't find a match on model number and you'd like to try all the matches of brand and type. If you don't know how to guess type from model number for a given brand, that gets much harder. Other than that I find Premise quite easy to navigate.
OP | Post 12 made on Thursday September 16, 2004 at 11:47
Dave Houston
RF Expert
Joined: October 2001
Posts: 1,521
The sheer size of both databases argues against them being learned by the companies that have published them. No company would spend that amount of money nor could they assign the personnel and give them the time to learn and document that many codes in our lifetime. Any company that did so would be bankrupt long before the project reached fruition.

The most likely origin is from a database from a company like UEI, X-10 or ???.

The Intellanet codes are so screwed up it's hard to say whether it's from poor learning or from a badly written program that harvested and translated them. All of their NEC codes have 33 data bits - it's difficult to attribute that to a bad learn. Bad learns tend to start or end in the middle of a code, like many in the Files area here.

You obviously haven't tried navigating the Premise site with FireFox.
Post 13 made on Thursday September 16, 2004 at 14:05
johnsfine
IR Expert
Joined: September 2002
Posts: 5,159
On 09/16/04 15:47 ET, Dave Houston said...
The sheer size of both databases argues against
them being learned by the companies that have
published them.

I agree, and I had wondered about that. All I can guess is that they have some good way of convincing professional installers (reselling their product) to give back IR code info.

The most likely origin is from a database from
a company like UEI,

I don't believe UEI would let anyone get away with that. I don't believe UEI's database has function NAMES represented as well as those do, nor that it records IR data as badly (as even Premise). UEI has many brands Premise doesn't and Premise has brands UEI doesn't. So even though my knowledge of UEI's database is very indirect, I'm confident in denying that theory.

X-10 or ???.

I don't know about X-10. "???" covers a lot of ground.

The Intellanet codes are so screwed up it's hard
to say whether it's from poor learning or from
a badly written program that harvested and translated
them.

That's the sort of thing I know a lot about. I usually don't have a hard time (with a decent size sample) determining whether IR data was screwed up in translation vs. screwed up in learning. The Intellanet codes have plenty of each.

All of their NEC codes have 33 data bits

Did you look at the Toshiba ones I mentioned? The Rawcode form of those is basically correct. The 32 (not 33) data bits are exactly correct in each one I looked at. The fact that some have the repeat part and some don't looks more like learning error than translation error.

- it's difficult to attribute that to a bad learn.

Do you have some specific 33 bit samples in mind? I don't recall noticing any and I'd rather look at a sample than guess from an abstract description.

Some of the TSU3000 bad learns posted at RC recently really stretch the limits of what you might expect from bad learning. But in some cases enough details are posted of where they came from and how they vary across relearn attempts, that any theory other than bad learning gets pretty far-fetched. At some level, that sort of "translation" vs. "learning" question is just a semantic issue, since "learning" is a kind of "translation" and the stranger errors are likely caused by firmware bugs rather than mechanical or analog issues in the IR to IR transfer. For this thread I assume "translation" means from one database to another and excludes errors during first capture.


Bad learns tend to start or end in the middle
of a code, like many in the Files area here.

That's one moderately common symptom. But there are plenty of bad learns that can't be characterised that way.

You obviously haven't tried navigating the Premise
site with FireFox.

Actually, I either interpreted "navigating" more narrowly or simply forgot a minor inconvenience because I work around it unconsciously out of habit.

Both FireFox and Firebird "navigate" that site just fine for me, but can't access the Pronto Hex (which is the ultimate point of navigating the site). Once I "navigate" down to the model I want, I always right click and select "open in IE" in order to get to the Pronto Hex. Anyway, I guess you're right about this one, because it's using the resource that matters, not some narrow definition of "navigating" and that's one more inconvenience on top of the lack of device types.

If you have IE installed at all, you ought to install the "Open in IE" right click option in FireFox. There are quite a few web sites that seem to do aggressive IE detect and force themselves to fail if you are running older IE or non IE. If it's an online vendor or other site subject to normal competition I just find one of their friendlier competitors. That right click is easier, but I avoid it on principle. If there are no decent alternatives I'll use that right click. I'm not fanatic about that "principle".
OP | Post 14 made on Thursday September 16, 2004 at 14:25
Dave Houston
RF Expert
Joined: October 2001
Posts: 1,521
On 09/16/04 18:05 ET, johnsfine said...

I don't believe UEI would let anyone get away
with that.

Reverse engineering is not illegal.

I don't know about X-10. "???" covers a lot of
ground.

X-10 built UEI's One-for-All remotes for many years. I don't know who owns the database. X-10 has claimed theirs is biggest.

??? covers a company in upstate NY, the name of which eludes me, that sells IR database chips.
EDIT: [Link: innotechsystems.com]

Do you have some specific 33 bit samples in mind?
I don't recall noticing any and I'd rather look
at a sample than guess from an abstract description.

Brand: nec
Model:
Remote: RC-6010

If you have IE installed at all

I only use IE for downloading patches from MS - most of which are for security flaws in IE.

This message was edited by Dave Houston on 09/16/04 14:40 ET.
Post 15 made on Thursday September 16, 2004 at 16:07
johnsfine
IR Expert
Joined: September 2002
Posts: 5,159
On 09/16/04 18:25 ET, Dave Houston said...
Brand: nec
Model:
Remote: RC-6010

They must have assembled the database from multiple sources with very different characteristics. No single sequence of capture and translations could produce both those Toshiba samples and those NEC brand samples. (I checked a few Yamaha as well and they look like the NEC, so you're likely right that most of the NEC protocol samples in this database look like those NEC brand samples.)

Those samples look program generated. Especially when looking at both the NEC and the Yamaha, I can't believe any translation, even buggy, from one relatively raw format to another could produce those results. If not generated entirely with something like MakeHex from manufacturer code lists (like those Yamaha publishes), they were at least regenerated, such as being pattern recognised into some condensed form like Pronto 900A hex and reexpanded by a buggy program.
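To show what "program generated" means in practice, here is a minimal sketch that builds learned-style Pronto hex for an NEC1 signal from a device, subdevice and function number. It is not MakeHex's code or algorithm. It assumes the commonly documented NEC1 timing (564 microsecond unit, 16/8 unit lead-in, 1/1 and 1/3 unit bits, 108 ms frames, 16/4 unit repeat frame) and a frequency word of 006C; the names and the sample numbers are made up.

```python
FREQ_WORD = 0x006C                      # Pronto frequency code, about 38.4 kHz
CARRIER_HZ = 1_000_000 / (FREQ_WORD * 0.241246)
UNIT_US = 564                           # NEC time unit in microseconds
FRAME_US = 108_000                      # each frame, lead-in to end of gap


def cycles(microseconds):
    """Convert a duration to carrier cycles, the unit Pronto hex uses."""
    return round(microseconds * CARRIER_HZ / 1_000_000)


def nec1_pronto(device, subdevice, function):
    """Build Pronto hex for an NEC1 signal: one main frame plus one repeat frame."""
    bits = []
    for value in (device, subdevice, function, function ^ 0xFF):
        bits += [(value >> i) & 1 for i in range(8)]   # least significant bit first

    main_us = [16 * UNIT_US, 8 * UNIT_US]              # lead-in burst and space
    for bit in bits:
        main_us += [UNIT_US, (3 if bit else 1) * UNIT_US]
    main_us += [UNIT_US, 0]                            # trailing burst, gap set below
    main_us[-1] = FRAME_US - sum(main_us)

    repeat_us = [16 * UNIT_US, 4 * UNIT_US, UNIT_US, 0]
    repeat_us[-1] = FRAME_US - sum(repeat_us)

    header = [0, FREQ_WORD, len(main_us) // 2, len(repeat_us) // 2]
    body = [cycles(us) for us in main_us + repeat_us]
    return " ".join(f"{word:04X}" for word in header + body)


print(nec1_pronto(0x40, 0xBF, 0x12))
```

Output from a generator like this is perfectly regular: every burst of the same kind has an identical value, which is one reason generated samples are easy to tell apart from raw captures.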

But it's even clearer that the Toshiba samples were not generated or regenerated that way. They came from raw captures and they retain characteristics that would be lost if too much translation occurred after capture.