(10:00:00 PM) racke: everyone ready for the meeting?
(10:00:26 PM) racke: agenda: http://www.icdevgroup.org/pipermail/interchange-users/2009-November/051387.html
(10:00:34 PM) pj: well, kinda, again, I'm not going to entirely be here
(10:00:52 PM) racke: no big deal, pj
(10:01:18 PM) racke: we feel with you :'(
(10:01:23 PM) matjones_: don't worry, I'm not all here either ;)
(10:02:56 PM) racke: ok, the following happened last week with Interchange source code:
(10:03:25 PM) racke: 1. Syslog overhaul by Jon - this needs an update of the documentation as well, any takers?
(10:04:06 PM) racke: 2. CGI parsing patch by Racke to fix UTF8 binary upload bug #268
(10:05:15 PM) racke: 3. max_matches patch from Jon (done for batrams' (DB) site)
(10:05:36 PM) bgerber left the room (quit: Read error: 110 (Connection timed out)).
(10:10:31 PM) racke: any treatment of your keyboard which results in recognizable output is allowed now :-D
(10:19:07 PM) batrams [i=600bee22@gateway/web/freenode/x-dvusptshietqhxrf] entered the room.
(10:20:35 PM) endpoint_david: as a side note, I think the email headers patch could/would work in most cases if we assume charset=latin1 when !MV_UTF8
(10:20:48 PM) endpoint_david: i.e., more brokenness would be right in general
(10:21:03 PM) racke: hello batrams
(10:21:09 PM) batrams: hi
(10:21:12 PM) racke: which email headers patch?
(10:21:29 PM) endpoint_david: the one we've discussed, nothing material yet...
:-)
(10:21:37 PM) endpoint_david: "patch" may be optimistic
(10:21:40 PM) racke: aha
(10:22:57 PM) endpoint_david: 8-bit headers are broken no matter what, so everything will need to be encoded for email regardless of whether we're using utf8 or not
(10:23:31 PM) endpoint_david: (i.e., we can't assume 8-bit SMTP-clean, plus the standards are us-ascii only for the headers)
(10:23:58 PM) racke: ok, but the world doesn't break down with latin1 headers
(10:24:47 PM) endpoint_david: yeah, that's probably other mail clients' defaults when unspecified
(10:25:14 PM) pj: isn't there a specified default?
(10:25:39 PM) pj: an RFC that says, use character encoding "X" unless otherwise specified?
(10:25:41 PM) racke: headers are ASCII
(10:25:59 PM) racke: unless you wrap the header with the funny notation
(10:26:25 PM) pj: ok, how are high characters encoded, then? url encoding or ???
(10:26:40 PM) racke: no special encoding
(10:27:03 PM) pj: so there's no way to legitimately have an email subject in Chinese?
(10:27:05 PM) racke: Bugzilla does the encoding like that:
(10:27:08 PM) racke: http://src.chromium.org/viewvc/chrome/branches/WebKit/BugsSite/Bugzilla/Mailer.pm?view=markup&pathrev=30348
(10:27:12 PM) racke: pj: it is
(10:27:13 PM) endpoint_david: pj: MIME-B
(10:27:50 PM) endpoint_david: which Encode has built-in support for (although at what version it was introduced, I'm not sure)
(10:27:55 PM) racke: it looks like that
(10:27:58 PM) racke: Subject: =?UTF-8?Q?Zwei_neue_features_f=C3=BCr_den_F=2DShop?=
(10:28:10 PM) pj: ok
(10:28:22 PM) pj: so a charset can be encoded into individual headers
(10:28:24 PM) endpoint_david: yeah, that's MIME-Q, which could come in handy if you can't read base64 ;-)
(10:28:41 PM) racke: yeah, but you have to do it for any non-ASCII header
(10:28:48 PM) pj: right
(10:29:23 PM) racke: ok, but most likely it's the subject
(10:29:53 PM) pj: yes, there may be one or two other headers that use it, but the subject is the one example where you will see it most by far.
(10:30:21 PM) racke: or From: Peter Äjamian O:-)
(10:30:25 PM) endpoint_david: it's a noop if it's ascii, so an easy fix would be to join "\n", map {encode('MIME-Q', $_)} split "\n", $email_headers
(10:30:46 PM) endpoint_david: which if it's a UTF-8 string will do the right thing
(10:30:51 PM) racke: are we sure that input is UTF-8 :-)
(10:31:05 PM) endpoint_david: if not, it should go latin-1
(10:31:08 PM) pj: right, also in email addresses themselves, though I think they may be supposed to use some other special encoding.
(10:31:27 PM) endpoint_david: hmm, they might use a variation of punycode for emails, haven't seen that specifically
(10:31:44 PM) racke: yeah
(10:31:49 PM) racke: oh and what's that?
(10:31:57 PM) racke: @EXPORT = qw (
(10:31:57 PM) racke: add_items
(10:31:57 PM) racke: do_order
(10:31:57 PM) racke: check_order
(10:31:57 PM) racke: check_required
(10:31:57 PM) racke: encrypt_standard_cc
(10:31:57 PM) racke: mail_order
(10:31:57 PM) racke: onfly
(10:31:57 PM) racke: route_order
(10:31:57 PM) racke: update_quantity
(10:31:57 PM) racke: validate_whole_cc
(10:31:57 PM) racke: );
(10:31:57 PM) racke: push @EXPORT, qw (
(10:31:57 PM) racke: send_mail
(10:31:57 PM) racke: );
(10:32:04 PM) endpoint_david: an ascii encoding for unicode DNS
(10:32:09 PM) endpoint_david: wikipedia has some
(10:32:12 PM) endpoint_david: info on it
(10:32:15 PM) racke: why isn't that one statement
(10:32:48 PM) pj: racke: git blame and find out
(10:33:42 PM) pj: it probably can be, though
(10:33:51 PM) racke: can that look into Minivend RCS ??
(10:34:22 PM) pj: I believe it has minivend revisions loaded, but it won't tell you much that far back.
(10:34:47 PM) pj: at best you would be able to see what release introduced it.
(10:40:41 PM) racke: Subject: Bestellbestätigung F-Shop, U8:
(10:40:45 PM) racke: hmm
(10:52:27 PM) rsiddall [i=189705da@gateway/web/freenode/x-dsqwbsxflyfbrwwp] entered the room.
(10:55:02 PM) endpoint_david: racke: is the first @EXPORT list a paste mistake? :-)
(10:55:47 PM) racke: no these are functions within Vend::Order
(10:56:18 PM) racke: send_mail is in Util.pm but globbed into Order.pm at the end of the module
(10:56:27 PM) racke: hello rsiddall
(10:59:53 PM) rsiddall: hello racke
(11:02:42 PM) racke: 127.0.1.1 esh7EqHi:127.0.1.1 - [17/November/2009:23:01:11 +0100] ulisses www.ulisses.exp:443/cgi-bin/ic/ulisses/checkout Runtime error: Can't call method "encode" on unblessed reference at /usr/lib/perl/5.10/Encode.pm line 158.
(11:02:44 PM) racke: hmmm
(11:02:51 PM) endpoint_david: ok, so what's the difference between Vend::Email and email.tag? i.e., why do we need both?
(11:03:05 PM) racke: discard Vend::Email for now
(11:03:21 PM) racke: I'm looking at send_mail for now
(11:03:33 PM) endpoint_david: and should email.tag and email_raw.tag's functionality be folded into Vend::Email, with a trivial wrapper?
(11:03:35 PM) endpoint_david: ok
(11:03:52 PM) racke: that was the idea, but never finished
(11:05:47 PM) k2b3 left the room ("I'm not here right now.").
(11:05:57 PM) racke: ok, wait a minute
(11:06:05 PM) racke: early bird went away
(11:06:20 PM) racke: aaaaaaaaaaaaaaaah
(11:06:38 PM) racke: unless (is_utf8($subject)) {
(11:06:38 PM) racke: utf8::decode($subject);
(11:06:38 PM) racke: }
(11:06:45 PM) racke: $subject = encode('MIME-Q', $subject);
(11:06:55 PM) racke: that seems to work now
(11:07:12 PM) racke: Thunderbird likes it now
(11:07:14 PM) endpoint_david: that will fail on non-utf8 8-bit encodings
(11:07:18 PM) endpoint_david: FYI
(11:07:49 PM) endpoint_david: works great for utf8 and us-ascii, though :-)
(11:07:54 PM) racke: why would that fail?
(11:08:18 PM) endpoint_david: trying to decode non-utf8 octets
(11:08:44 PM) racke: apparently my $subject is neither utf8 nor us-ascii
(11:08:46 PM) endpoint_david: (it may be non-strict and pass-through, so it may work in practice)
(11:08:51 PM) racke: this stuff is really confusing
(11:08:58 PM) racke: ok let's test without that
(11:13:07 PM) racke: now I get Subject: =?UTF-8?Q?Bestellbest=C3=83=C2=A4tigung=20F?==?UTF-8?Q?=2DShop?=
(11:13:10 PM) racke: instead of
(11:13:27 PM) racke: =?UTF-8?Q?Bestellbest=C3=A4tigung=20F?==?UTF-8?Q?=2DShop?=
(11:13:35 PM) racke: the latter works ..
(11:14:03 PM) racke: aha
(11:14:11 PM) racke: the string is defined in a usertag
(11:14:25 PM) racke: which is not read as UTF8
(11:14:43 PM) steamguy left the room (quit: "KVIrc Insomnia 4.0.0, revision: , sources date: 20090520, built on: 2009/06/06 11:44:47 UTC http://www.kvirc.net/").
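[Editor's note: the two Subject values racke pasted differ only in how "ä" was encoded (=C3=A4 versus =C3=83=C2=A4). The sketch below is a minimal reproduction of that difference, assuming the usertag string reaches send_mail as undecoded UTF-8 octets. The variable names are illustrative and are not taken from the Interchange source, and the exact folding of the output depends on the installed Encode version.]

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# "Bestellbestätigung" as a Perl character string (ä is U+00E4).
my $chars = "Bestellbest\x{e4}tigung";

# The same text as raw UTF-8 octets, as it would look if it were read
# from a file without any decoding step (assumed here for illustration).
my $octets = encode('UTF-8', $chars);

# Encoding the character string gives the expected =C3=A4 for "ä".
my $good = encode('MIME-Q', $chars);

# Encoding the undecoded octets treats each byte as its own character,
# so the two bytes of "ä" are UTF-8 encoded a second time.
my $bad = encode('MIME-Q', $octets);

# Decoding the octets before encoding avoids the double encoding.
my $fixed = encode('MIME-Q', decode('UTF-8', $octets));

print "good:  $good\n";
print "bad:   $bad\n";
print "fixed: $fixed\n";
```

[The "bad" line should contain =C3=83=C2=A4, matching the broken Subject in the log, while "good" and "fixed" should contain =C3=A4.]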
(11:18:48 PM) racke: now I got
(11:18:52 PM) racke: 127.0.1.1 esh7EqHi:127.0.1.1 - [17/November/2009:23:16:49 +0100] ulisses www.ulisses.exp:443/cgi-bin/ic/ulisses/checkout Subject: Bestellbestätigung F-Shop, U8: 1
(11:18:52 PM) racke: 127.0.1.1 esh7EqHi:127.0.1.1 - [17/November/2009:23:16:49 +0100] ulisses www.ulisses.exp:443/cgi-bin/ic/ulisses/checkout Subject: =?UTF-8?Q?Bestellbest=C3=83=C2=A4tigung?==?UTF-8?Q?=20F=2DShop?=.
(11:18:59 PM) racke: so still broken
(11:20:13 PM) endpoint_david: yeah, the second subject is doubly-encoded utf8
(11:21:07 PM) endpoint_david: even though the Q encoding is supposed to be more readable, it's hardy, IMHO
(11:21:13 PM) endpoint_david: *hardly so
(11:22:54 PM) racke: that one is a tricky beast, I tell you!
(11:29:08 PM) endpoint_david: this is why I think we need the latin-1 decode initially if it's not already utf8
(11:38:08 PM) racke: what latin-1 decode, how can we do that?
(11:38:47 PM) racke: it should be UTF-8 anyway
(11:39:08 PM) racke: unless it stems from another bug :-)
(11:39:46 PM) racke: let me run another test
(11:39:54 PM) batrams left the room (quit: "Page closed").
(11:43:37 PM) racke: let me hack on it until the next meeting
(11:43:59 PM) endpoint_david: unless (is_utf8($subject)) {
(11:43:59 PM) endpoint_david: $subject = Encode::decode('latin1', $subject);
(11:44:00 PM) endpoint_david: }
(11:44:12 PM) endpoint_david: or we could do a:
(11:44:18 PM) racke: maybe I can cut the Gordian knot
(11:44:42 PM) racke: unfortunately it comes out broken if it's UTF8
(11:45:06 PM) endpoint_david: $subject = eval { Encode::decode_utf8($subject) } || eval { Encode::decode('latin1',$subject) };
(11:45:25 PM) endpoint_david: which, if you skip the if block?
(11:46:00 PM) racke: uh what does this statement do?
(11:46:21 PM) endpoint_david: it'll try to decode as utf8 first, then failing that, as latin1
(11:46:51 PM) endpoint_david: the eval will return undef if it fails, so you can short-circuit in that case
(11:47:02 PM) racke: why do we need the eval?
(11:47:51 PM) endpoint_david: we don't in the second case, I suppose; it'd die in the first case if the decode failed
(11:48:17 PM) endpoint_david: (or you can pass a check param to decode_utf8 to make it do so; I think it dies by default, though)
(11:54:56 PM) rsdvd [n=rsdvd@rsdvd1.plus.com] entered the room.
(11:55:13 PM) racke: hello phil
(11:55:22 PM) racke: I would like to adjourn the meeting now
(11:55:28 PM) rsdvd: Hi Stefan
(11:55:35 PM) rsdvd: perfect timing :-)
(11:55:47 PM) racke: too late, and that needs patient hacking
(11:56:00 PM) racke: but of course I'm happy about any help I get, David
(11:56:25 PM) racke: well, if you have a question Phil, go ahed
(11:56:31 PM) racke: ahead :-), damn fingers
(11:59:00 PM) rsdvd: no questions - just interested in what was being discussed - I will read the log when someone posts it
(11:59:27 PM) racke: yes, not so much fuss today :-)
(11:59:37 PM) rsdvd: :-)
(11:59:51 PM) racke: Jon is on the road, etc.
(12:00:17 AM) racke: ok, good night #interchange!
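[Editor's note: a minimal sketch of the "try UTF-8 first, then Latin-1" idea discussed above. The helper name decode_header_octets is hypothetical and not part of Interchange. Two details differ from the one-liner in the log. First, Encode's decode does not die on malformed input unless a CHECK value such as FB_CROAK is passed; by default it substitutes U+FFFD, so the fallback would never run. Second, the result is tested with defined rather than ||, so that an empty string or "0" is not mistaken for a failed decode.]

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: turn header octets into a character string,
# trying strict UTF-8 first and Latin-1 second.
sub decode_header_octets {
    my ($octets) = @_;

    # Decode a copy, because decode() may modify its source argument
    # when a CHECK value is given. FB_CROAK makes it die on bad UTF-8.
    my $copy  = $octets;
    my $chars = eval { decode('UTF-8', $copy, Encode::FB_CROAK) };

    # Test definedness rather than truth so "" and "0" survive.
    return $chars if defined $chars;

    # Every octet sequence is valid Latin-1, so this decode cannot fail.
    return decode('ISO-8859-1', $octets);
}

# The same subject word arriving as UTF-8 octets and as Latin-1 octets.
my $from_utf8   = decode_header_octets("Bestellbest\xC3\xA4tigung");
my $from_latin1 = decode_header_octets("Bestellbest\xE4tigung");

# Both inputs now hold the same characters, so MIME-Q encoding either
# one yields the same header value.
print "same text\n" if $from_utf8 eq $from_latin1;
print encode('MIME-Q', $from_utf8), "\n";
```

[One caveat of this design: Latin-1 text whose bytes happen to form valid UTF-8 is decoded as UTF-8, so the fallback is a heuristic and not a guarantee.]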