Discussion:
OpenSSL and compression using ZLIB
Le Saux, Eric
2002-11-12 00:21:52 UTC
OpenSSL (0.9.6g) has support for compression, using both RLE and ZLIB.
The way ZLIB is used, the compress() function is called on each block of
data transmitted. compress() is a higher-level function that calls
deflateInit(), deflate() and deflateEnd().

I am trying to understand why ZLIB is being used that way. Here is what
gives better results on a continuous reliable stream of data:

1) You create a z_stream for sending, and another z_stream for
receiving.
2) You call deflateInit() and inflateInit() on them, respectively,
when the communication is established.
3) For each data block you send, you call deflate() on it. For each
data block you receive, you call inflate() on it.
4) When the connection is terminated, you call deflateEnd() and
inflateEnd() respectively.

There are many advantages to that. For one, the initialization functions
are not called as often.
But by far, the main advantage is that you can achieve good compression even
for very small blocks of data. The "dictionary" window stays open for the
whole communication stream, making it possible to compress a message by
reference to a number of previously sent messages.
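(For illustration, a minimal sketch of steps 1 to 4 on the sending side,
against the plain zlib API; the conn_* names and the single static stream
are hypothetical simplifications, and the receiving side is symmetrical
with inflateInit()/inflate()/inflateEnd().)

    #include <string.h>
    #include <zlib.h>

    /* One long-lived deflate stream per connection. */
    static z_stream out;

    int conn_open(void)
    {
        memset(&out, 0, sizeof(out));   /* zalloc/zfree/opaque = NULL */
        return deflateInit(&out, Z_DEFAULT_COMPRESSION) == Z_OK ? 0 : -1;
    }

    /* Compress one outgoing block. The dictionary window persists
     * across calls, so later blocks can be coded by reference to
     * earlier ones. */
    int conn_send_block(const unsigned char *in, unsigned int inlen,
                        unsigned char *comp, unsigned int *complen)
    {
        out.next_in   = (Bytef *)in;
        out.avail_in  = inlen;
        out.next_out  = comp;
        out.avail_out = *complen;
        /* Z_SYNC_FLUSH pushes out all pending output without resetting
         * the compressor's state. */
        if (deflate(&out, Z_SYNC_FLUSH) != Z_OK)
            return -1;
        *complen -= out.avail_out;      /* bytes actually produced */
        return 0;
    }

    void conn_close(void)
    {
        deflateEnd(&out);               /* free state once, at the end */
    }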

Thank you for sharing your ideas on this,

Eric Le Saux
Electronic Arts
Bear Giles
2002-11-12 04:13:55 UTC
Post by Le Saux, Eric
I am trying to understand why ZLIB is being used that way. Here is what
gives better results on a continuous reliable stream of data:
1) You create a z_stream for sending, and another z_stream for
receiving.
2) You call deflateInit() and inflateInit() on them, respectively,
when the communication is established.
3) For each data block you send, you call deflate() on it. For
each data block you receive, you call inflate() on it.
You then die from the latency in the inflation/deflation routines. You
have to flush the deflater for each block, and depending on how you do
it your performance is the same as deflating each block separately.
Post by Le Saux, Eric
4) When the connection is terminated, you call deflateEnd() and
inflateEnd() respectively.
...
Post by Le Saux, Eric
But by far, the main advantage is that you can achieve good compression
even for very small blocks of data. The "dictionary" window stays open
for the whole communication stream, making it possible to compress a
message by reference to a number of previously sent messages.
If you do a Z_SYNC_FLUSH (iirc), it blows the dictionary. This is
intentional, since you can restart the inflater at every SYNC mark.

I thought there was also a mode to flush the buffer (including any
necessary padding for partial bytes) but not blowing the dictionary, but
I'm not sure how portable it is.

Le Saux, Eric
2002-11-12 18:13:19 UTC
I will try again to explain what goes on.

OpenSSL uses ZLIB compression in the following manner:
On each block of data transmitted, compress() is called.
It's equivalent to deflateInit() + deflate() + deflateEnd().

On a reliable continuous stream of data you can use it in the following way:
You call deflateInit() when the connection is established.
You call deflate() for each block to transmit, using Z_SYNC_FLUSH.
When the connection closes, you call deflateEnd().

In the latter case, you do not initialize and destroy the dictionary for
each block you transmit.
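(To make the difference concrete, this is roughly what compress() amounts
to internally -- a simplified sketch, not the actual zlib source:)

    #include <zlib.h>

    /* A fresh stream is built and torn down on every call, so no
     * history can carry over from one record to the next. */
    int compress_once(Bytef *dest, uLongf *destLen,
                      const Bytef *source, uLong sourceLen)
    {
        z_stream s;
        int err;

        s.zalloc = Z_NULL;
        s.zfree  = Z_NULL;
        s.opaque = Z_NULL;
        s.next_in   = (Bytef *)source;
        s.avail_in  = (uInt)sourceLen;
        s.next_out  = dest;
        s.avail_out = (uInt)*destLen;

        if ((err = deflateInit(&s, Z_DEFAULT_COMPRESSION)) != Z_OK)
            return err;
        err = deflate(&s, Z_FINISH);  /* one shot: compress and terminate */
        *destLen = s.total_out;
        deflateEnd(&s);               /* the dictionary dies right here */
        return err == Z_STREAM_END ? Z_OK : err;
    }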

Now there are three relevant flush options for deflate(): Z_NO_FLUSH,
Z_SYNC_FLUSH and Z_FULL_FLUSH. For interactive applications you need to
flush, otherwise your block of data may get stuck in the pipeline until
more data pushes on it. With Z_SYNC_FLUSH, you force the compressor to
output the compressed data immediately. With Z_FULL_FLUSH, you
additionally reset the compressor's state.

I ran tests using these options, and on our typical datastream sample, it
meant for us a compression factor of 6:1 with Z_SYNC_FLUSH and 2:1 with
Z_FULL_FLUSH. With Z_SYNC_FLUSH, the dictionary is not trashed.

The way OpenSSL uses ZLIB, resetting the compressor's state after each
block of data, you achieve results similar to Z_FULL_FLUSH.

I hope this clarifies things.

So I am still wondering if there is a reason why each block of data is
compressed independently from the previous one in the OpenSSL use of
compression. Is it an architectural constraint?


Eric Le Saux
Electronic Arts

David Schwartz
2002-11-12 18:26:52 UTC
Post by Bear Giles
Post by Le Saux, Eric
I am trying to understand why ZLIB is being used that way. Here is what
gives better results on a continuous reliable stream of data:
1) You create a z_stream for sending, and another z_stream for
receiving.
2) You call deflateInit() and inflateInit() on them, respectively,
when the communication is established.
3) For each data block you send, you call deflate() on it. For
each data block you receive, you call inflate() on it.
You then die from the latency in the inflation/deflation routines. You
have to flush the deflater for each block, and depending on how you do
it your performance is the same as deflating each block separately.
I think you're totally missing his point. Yes, you have to flush (sync
really) the compressor at the end of each block to ensure that all the
compressed data comes out the other side when that block is received. But
this doesn't mean you have to purge the dictionary!
Post by Bear Giles
Post by Le Saux, Eric
But by far, the main advantage is that you can achieve good compression
even for very small blocks of data. The "dictionary" window stays open
for the whole communication stream, making it possible to compress a
message by reference to a number of previously sent messages.
If you do a Z_SYNC_FLUSH (iirc), it blows the dictionary. This is
intentional, since you can restart the inflater at every SYNC mark.
No, this is not true. A Z_SYNC_FLUSH does not permit you to restart the
inflater at every SYNC mark. That is not its purpose.
Post by Bear Giles
I thought there was also a mode to flush the buffer (including any
necessary padding for partial bytes) but not blowing the dictionary, but
I'm not sure how portable it is.
That is what Z_SYNC_FLUSH does. The usual strategy when processing a block
is to use Z_SYNC_FLUSH if the input queue is empty. Otherwise, don't bother
to flush because you know more data is coming shortly.
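(A sketch of that strategy, with hypothetical names; the caller keeps one
z_stream per direction as described earlier in the thread:)

    #include <zlib.h>

    /* Only sync-flush when nothing else is queued, so back-to-back
     * blocks share deflate output and the flush overhead is paid once. */
    int queue_block(z_stream *strm,
                    unsigned char *in, unsigned int inlen,
                    unsigned char *out, unsigned int outlen,
                    int more_queued)
    {
        strm->next_in   = in;
        strm->avail_in  = inlen;
        strm->next_out  = out;
        strm->avail_out = outlen;
        return deflate(strm, more_queued ? Z_NO_FLUSH : Z_SYNC_FLUSH);
    }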

DS


Le Saux, Eric
2002-11-13 01:07:51 UTC
RFC2246 mentions "compression state" in its list of connection states.
It also says the following:

6.2.2. Record compression and decompression

[snip snip] The compression algorithm translates a
TLSPlaintext structure into a TLSCompressed structure. Compression
functions are initialized with default state information whenever a
connection state is made active.

I will go ahead and see if we can take better advantage of ZLIB's stream
compression.

If it doesn't fit, I will simply compress my data one layer above SSL.

-
Eric





-----Original Message-----
From: David Schwartz [mailto:***@webmaster.com]
Sent: Tuesday, November 12, 2002 4:24 PM
To: openssl-***@openssl.org; openssl-***@openssl.org; Le Saux, Eric
Subject: RE: OpenSSL and compression using ZLIB
I believe Gregory Stark meant RFC2246.
Okay, but I don't see where RFC2246 says that the compression/decompression
protocol can't have state, or must compress each block independently, or
that any particular compression protocol must be implemented in any
particular way.

DS
David Schwartz
2002-11-13 02:49:45 UTC
Post by Le Saux, Eric
6.2.2. Record compression and decompression
[snip snip] The compression algorithm translates a
TLSPlaintext structure into a TLSCompressed structure. Compression
functions are initialized with default state information whenever a
connection state is made active.
The connection is active the whole time, isn't it? I don't see any language
to suggest that the connection becomes inactive between blocks.

IMO, the SSL engine should only force a sync from zlib when the input queue
empties. I see no reason it should ever reset the dictionary for as long as
the connection lasts.

DS


Gregory Stark
2002-11-14 00:42:17 UTC
Post by David Schwartz
Post by Le Saux, Eric
6.2.2. Record compression and decompression
[snip snip] The compression algorithm translates a
TLSPlaintext structure into a TLSCompressed structure. Compression
functions are initialized with default state information whenever a
connection state is made active.
The connection is active the whole time, isn't it? I don't see any
language to suggest that the connection becomes inactive between blocks.
IMO, the SSL engine should only force a sync from zlib when the input
queue empties. I see no reason it should ever reset the dictionary for
as long as the connection lasts.
Oops, I meant 2246. And reading it more carefully, I agree with your
interpretation. The dictionary need not be reset. Compression state can and
should be maintained across records. Did anyone do an RFC draft for DEFLATE
in TLS?

Peter 'Luna' Runestig
2002-11-24 09:10:26 UTC
Post by Gregory Stark
Oops, I meant 2246. And reading it more carefully, I agree with your
interpretation. The dictionary need not be reset. Compression state can and
should be maintained across records.
So, is anyone working on improving the zlib code according to these new
guidelines?

Cheers,
- Peter
--
Peter 'Luna' Runestig (fd. Altberg), Sweden <***@runestig.com>
PGP Key ID: 0xD07BBE13
Fingerprint: 7B5C 1F48 2997 C061 DE4B 42EA CB99 A35C D07B BE13
AOL Instant Messenger Screen name: PRunestig

Richard Levitte - VMS Whacker
2002-11-24 11:42:35 UTC
In message <***@runestig.com> on Sun, 24 Nov 2002 10:10:26 +0100, "Peter 'Luna' Runestig" <peter+openssl-***@runestig.com> said:

peter+openssl-dev> Gregory Stark wrote:
peter+openssl-dev> > Oops, I meant 2246. And reading it more
peter+openssl-dev> > carefully, I agree with your interpretation. The
peter+openssl-dev> > dictionary need not be reset. Compression state
peter+openssl-dev> > can and should be maintained across records.
peter+openssl-dev>
peter+openssl-dev> So, is anyone working on improving the zlib code
peter+openssl-dev> according to these new guidelines?

Well, I've got a couple of issues with such a change:

1. Is OpenSSL really the only implementation that has ZLIB at all? I
   believe there aren't any compression numbers defined for ZLIB yet
   (or have they actually been defined by now?), so I guess it might
   be tricky to implement in any case...
   If OpenSSL isn't alone with ZLIB compression, perhaps we should
   look at interoperability?

2. How does that affect communication with programs running older
   versions of OpenSSL? I assume that a change in dictionary resetting
   will also change the actual data that results from compression.
   Will that be a problem?
--
Richard Levitte \ Spannvägen 38, II \ ***@stacken.kth.se
***@Stacken \ S-168 35 BROMMA \ T: +46-8-26 52 47
\ SWEDEN \ or +46-708-26 53 44
Procurator Odiosus Ex Infernis -- ***@bofh.se
Member of the OpenSSL development team: http://www.openssl.org/

Unsolicited commercial email is subject to an archival fee of $400.
See <http://www.stacken.kth.se/~levitte/mail/> for more info.
Jeffrey Altman
2002-11-24 14:26:29 UTC
http://www.ietf.org/internet-drafts/draft-ietf-tls-compression-03.txt

defines the compression numbers to be:

enum { null(0), ZLIB(1), LZS(2), (255) } CompressionMethod;

Therefore proposed numbers have been issued. I suggest that OpenSSL
define the CompressionMethod numbers to be:

enum { null(0), ZLIB(1), LZS(2), eayZLIB(224), eayRLE(225), (255) }
CompressionMethod

as values in the range 193 to 255 are reserved for private use.

Where does the above draft state that the dictionary must be reset?
It states that the engine must be flushed but does not indicate that
the dictionary is to be reset. Resetting the dictionary would turn
ZLIB into a stateless compression algorithm and according to the draft
ZLIB is most certainly a stateful algorithm:

"the compressor maintains it's state through all compressed records"

I do not believe that compatibility will be an issue. It will simply
result in the possibility that the compressed data is distributed
differently among the TLS frames that make up the stream.

If compatibility is an issue, we could implement a new variant of
COMP_zlib(), COMP_tls_zlib(), that would be used with ZLIB(1).
Jeffrey Altman * Volunteer Developer Kermit 95 2.1 GUI available now!!!
The Kermit Project @ Columbia University SSH, Secure Telnet, Secure FTP, HTTP
http://www.kermit-project.org/ Secured with MIT Kerberos, SRP, and
kermit-***@columbia.edu OpenSSL.
pobox
2002-11-24 22:43:12 UTC
The draft clearly implies that the dictionary need not be reset and probably
should not be reset, but it is not clear to me that it prohibits this.
However, the draft talks about ...
"If TLS is not being used with a protocol that provides reliable, sequenced
packet delivery, the sender MUST flush the compressor completely" ...
I find this confusing because I've always understood that TLS assumes it is
running over just such a protocol. If I read it correctly, even EAP-TLS (RFC
2716) will handle sequencing, duped, and dropped packets before TLS
processing is invoked. So what's this clause alluding to?

In any event, I think I agree that the compressor can compatibly behave in
different ways as long as the decompressor doesn't care. I'm just not sure I
understand RFC1950 and RFC1951 well enough to know what is possible. Is
"flushing the compressor completely" (as in the TLS compression draft
language) equivalent to compressing all the current data and emitting an
end-of-block code (value 256 in the language of RFC1951)? I'm guessing it
is. Is "resetting the dictionary" equivalent to compressing all the current
data and sending the block with the BFINAL bit set? If so, then it seems
like the decompressor can always react correctly, and therefore compatibly,
in any of the three cases. If the dictionary is reset for every record
(current OpenSSL behavior), the decompressor knows this because the BFINAL
bit is set for every record. If the dictionary is not reset but is flushed
for every record, the decompressor knows this because every record ends
with an end-of-block code. If the most optimal case is in play, which
implies a single uncompressed plaintext byte might be split across multiple
records, the decompressor can recognize and react properly to that case
too. If all this is correct, then the next question is ...
What will the current implementation of the decompressor in OpenSSL do in
each of these cases?
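(One reading of this, as an untested sketch: a single persistent inflater
can follow all three sender behaviours, because inflate() just walks the
DEFLATE block structure; the only extra step is re-priming it when a
per-record sender ends its stream:)

    #include <zlib.h>

    int expand_record(z_stream *strm,
                      unsigned char *in, unsigned int inlen,
                      unsigned char *out, unsigned int outlen,
                      unsigned int *produced)
    {
        int err;

        strm->next_in   = in;
        strm->avail_in  = inlen;
        strm->next_out  = out;
        strm->avail_out = outlen;
        err = inflate(strm, Z_SYNC_FLUSH);
        if (err == Z_STREAM_END)       /* sender reset per record (case 1) */
            err = inflateReset(strm);  /* be ready for the next record */
        if (err != Z_OK)
            return -1;
        *produced = outlen - strm->avail_out;
        return 0;
    }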


--greg
***@pobox.com

Richard Levitte - VMS Whacker
2002-11-24 23:10:50 UTC
In message <001601c2940a$deed1b60$***@dell8200> on Sun, 24 Nov 2002 16:43:12 -0600, "pobox" <***@pobox.com> said:

ghstark> What will the current implementation of the decompressor in
ghstark> OpenSSL do in each of these cases?

Unless this can be determined from the code, it can be tested by building
several OpenSSLs with different behavior and testing them against each
other.

In any case, now that I know the numbers (yeah, I know, draft numbers,
but that's better than nothing), I can always put them in 0.9.8-dev
and try several algorithms (as was suggested, there's a private range,
and I see no harm in using them for tests, at least temporarily).
--
Richard Levitte \ Spannvägen 38, II \ ***@stacken.kth.se
***@Stacken \ S-168 35 BROMMA \ T: +46-8-26 52 47
\ SWEDEN \ or +46-708-26 53 44
Procurator Odiosus Ex Infernis -- ***@bofh.se
Member of the OpenSSL development team: http://www.openssl.org/

Unsolicited commercial email is subject to an archival fee of $400.
See <http://www.stacken.kth.se/~levitte/mail/> for more info.
Le Saux, Eric
2002-11-26 02:39:38 UTC
In the current implementation of OpenSSL, compression/decompression state is
initialized and destroyed per record. It cannot possibly interoperate with
a compressor that maintains compression state across records. The
decompressor does care, unfortunately. The other way around could work,
though: a compressor that works per record, sending to a decompressor that
maintains state.

Personally I am adding a separate compression scheme that I called
COMP_streamzlib to the already existing COMP_zlib and COMP_rle methods
defined in OpenSSL. The only (but significant) difference is that it
maintains the compression state across records. For the time being, I will
just use one of the private IDs mentioned in the previous emails (193 to
255), as it is not compatible with the current zlib/openssl compression.
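(A rough, untested sketch of what such a method could look like against
the 0.9.x COMP_METHOD layout; the sz_* callbacks and the global stream
pair are simplifications of mine, not Eric's actual code -- COMP_CTX has
no private data pointer, so a real version needs per-context storage:)

    #include <string.h>
    #include <openssl/comp.h>
    #include <zlib.h>

    static z_stream zout, zin;

    static int sz_init(COMP_CTX *ctx)
    {
        memset(&zout, 0, sizeof(zout));
        memset(&zin, 0, sizeof(zin));
        return deflateInit(&zout, Z_DEFAULT_COMPRESSION) == Z_OK
            && inflateInit(&zin) == Z_OK;
    }

    static void sz_finish(COMP_CTX *ctx)
    {
        deflateEnd(&zout);
        inflateEnd(&zin);
    }

    static int sz_compress(COMP_CTX *ctx, unsigned char *out,
                           unsigned int olen, unsigned char *in,
                           unsigned int ilen)
    {
        zout.next_in = in;   zout.avail_in = ilen;
        zout.next_out = out; zout.avail_out = olen;
        if (deflate(&zout, Z_SYNC_FLUSH) != Z_OK)  /* state survives */
            return -1;
        return (int)(olen - zout.avail_out);
    }

    static int sz_expand(COMP_CTX *ctx, unsigned char *out,
                         unsigned int olen, unsigned char *in,
                         unsigned int ilen)
    {
        zin.next_in = in;   zin.avail_in = ilen;
        zin.next_out = out; zin.avail_out = olen;
        if (inflate(&zin, Z_SYNC_FLUSH) != Z_OK)
            return -1;
        return (int)(olen - zin.avail_out);
    }

    static COMP_METHOD comp_streamzlib = {
        226,              /* placeholder id; stock methods use a NID here */
        "stateful zlib",
        sz_init, sz_finish,
        sz_compress, sz_expand,
        NULL, NULL        /* ctrl / callback_ctrl, unused */
    };

    COMP_METHOD *COMP_streamzlib(void) { return &comp_streamzlib; }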

Eric Le Saux
Electronic Arts

Howard Chu
2002-11-26 05:00:41 UTC
Post by Le Saux, Eric
In the current implementation of OpenSSL, compression/decompression
state is initialized and destroyed per record. It cannot possibly
interoperate with a compressor that maintains compression state across
records. The decompressor does care, unfortunately.
This is surprising. I haven't looked at the code recently, but my experience
has been that a special bit sequence is emitted to signal a dictionary flush.
I haven't tested it either, so if you say it didn't work I believe you. But
plain old LZW definitely does not have this problem: the compressor can do
whatever it wants, and the decompressor will stay sync'd up because it
detects these reset codes.

-- Howard Chu
Chief Architect, Symas Corp. Director, Highland Sun
http://www.symas.com http://highlandsun.com/hyc
Symas: Premier OpenSource Development and Support

Jeffrey Altman
2002-11-26 03:36:36 UTC
Eric:

Thanks for your feedback. Please contribute your code. I suggest you
use

eayZLIBstream(226)

- Jeff
Le Saux, Eric
2002-11-26 18:24:17 UTC
Again I want to clarify this point: the issue is in the way ZLIB is used by
OpenSSL, not in ZLIB itself. The compressor's state is built and destroyed
on every record because OpenSSL uses ZLIB's compress() call, which in turn
calls the lower-level deflateInit(), deflate() and deflateEnd() functions.

This ensures that the records are compression-independent from one another,
and the initial question that started this thread was about the existence of
any requirement in the definition of SSL that required such independence.

Most people discussing this point here do not believe there is such a
requirement, but I am not sure if we have a definitive opinion on this.
Some standards body will have to address that.

One thing is sure though: for specific applications where client and server
are under the control of the same developers, it does make sense to use ZLIB
differently when it is definitely known that the underlying protocol is
indeed reliable. That is why I am currently testing a very small addition
to OpenSSL's compression methods that I called streamzlib (I am considering
another name suggested yesterday on this mailing list). Some preliminary
tests with ZLIB showed that I can go from a 2:1 compression factor to 6:1.

For completeness I must also say that for specific applications, compression
can be done just before and outside of the OpenSSL library. My personal
decision to push it down there is to avoid adding another encapsulation
layer in that part of our code that is written in C.

Now when compression within SSL matures, it will be necessary to have more
control over the compressor's operation than just turning it on. In ZLIB
you have the choice of 10 compression levels, which trade off compression
quality against speed of execution. There are other options that you could
set, such as the size of the dictionary. Future compression methods
supported by SSL will probably have their own different set of options.

All this will be an excellent subject of discussion in some SSL standard
committee.

Cheers,

Eric Le Saux
Electronic Arts

Geoff Thorpe
2002-11-26 19:00:43 UTC
Salut Eric,

Thanks for describing what you're up to and thanks (in advance) for
contributing your implementation(s). OpenSSL is used for a lot more than
building free webservers, despite misconceptions to the contrary, and
having a reasonably optimal zlib compression layer right there in the
SSL/TLS implementation will be useful to many people (and for some, as an
unexpected and no-hassle bonus to their apps).
Post by Le Saux, Eric
All this will be an excellent subject of discussion in some SSL
standard committee.
Standards ... ah yes, where "the customer is always wrong". I dare suggest
that the best way forward in that respect is to get a widely used SSL/TLS
implementation supporting compression in a sensible and tried-and-tested
manner, let it become a de-facto standard, then let standards authors
grumble over who'll get to backfit some RFC to it. At least that way
around, the dog wags the tail I suppose ... :-)

Cheers,
Geoff
--
Geoff Thorpe
***@geoffthorpe.net
http://www.geoffthorpe.net/


David Rees
2002-11-26 20:08:34 UTC
Post by Geoff Thorpe
Thanks for describing what you're up to and thanks (in advance) for
contributing your implementation(s). OpenSSL is used for a lot more
than building free webservers, despite misconceptions to the contrary,
and having a reasonably optimal zlib compression layer right there in
the SSL/TLS implementation will be useful to many people (and for
some, as an unexpected and no-hassle bonus to their apps).
I for one would love to see SSL/TLS implementations support a standard
compression method, especially for the purpose of serving web pages.
At this point in time, there aren't any good options for compressing
encrypted content if you want to use Apache and support a wide array of
clients, especially if you want to compress SSL data.

Does anyone know if any browsers out there have any support for zlib
compression under SSL/TLS?

-Dave
Pablo J Royo
2002-11-27 08:26:48 UTC
I have used ZLIB in several projects, though my knowledge of it is not as
deep as yours, but... aren't you talking about a simple BIO for compressing
data? (Or did I miss something in this discussion thread?)
I think the BIO would maintain the context (as the z_stream struct of ZLIB
does) across several calls to BIO_write/read, so if you want to compress
communication data you have to chain this "zBIO" with a socket BIO.
Some discussion and a solution on this can be found here:

http://marc.theaimsgroup.com/?l=openssl-dev&m=99927148415628&w=2

I have used that to compress/cipher/base64 big files with chained BIOs (and
a similar implementation of the zBIO shown there) and it works, so maybe it
would work one step more with socket BIOs.
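(A sketch of that kind of file chain, under the assumption that the zBIO
from the linked post sits on top; here only cipher -> base64 -> file is
shown, with key/IV handling elided:)

    #include <openssl/bio.h>
    #include <openssl/evp.h>

    int write_enc_b64(const char *path, const unsigned char *buf, int len,
                      unsigned char *key, unsigned char *iv)
    {
        BIO *file = BIO_new_file(path, "w");
        BIO *b64  = BIO_new(BIO_f_base64());
        BIO *enc  = BIO_new(BIO_f_cipher());
        int ok;

        BIO_set_cipher(enc, EVP_des_ede3_cbc(), key, iv, 1 /* encrypt */);
        BIO_push(enc, BIO_push(b64, file));  /* enc -> b64 -> file */
        ok = BIO_write(enc, buf, len) == len && BIO_flush(enc) > 0;
        BIO_free_all(enc);
        return ok ? 0 : -1;
    }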


Le Saux, Eric
2002-11-27 17:33:59 UTC
Yes, very interesting.

This is another way of adding compression to the data pipe.
I have not looked at the code, but I assume that the compression state is
maintained for the whole life of the communication channel, which is what
gives the best results.

Have you tried to use SSL_COMP_add_compression_method() also?
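(For reference, the registration itself would be a one-liner, assuming a
constructor like the COMP_streamzlib() sketched earlier in the thread;
226 follows the eayZLIBstream(226) suggestion:)

    #include <openssl/ssl.h>
    #include <openssl/comp.h>

    COMP_METHOD *COMP_streamzlib(void);  /* hypothetical, see earlier */

    int register_streamzlib(void)
    {
        /* Make the method negotiable during the SSL/TLS handshake. */
        return SSL_COMP_add_compression_method(226, COMP_streamzlib());
    }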

Cheers,

Eric Le Saux
Electronic Arts



Geoff Thorpe
2002-11-27 19:58:24 UTC
Post by Le Saux, Eric
Yes, very interesting.
This is another way of adding compression to the data pipe.
I have not looked at the code, but I assume that the compression state
is maintained for the whole life of the communication channel, which is
what gives the best results.
Um, out of curiosity ... wouldn't this be the easiest way to implement a
custom compression method anyhow? Ie. define the compression method so
that the SSL/TLS handshake can take care of agreeing (or not) about
compression at each end, but do not implement the method inside SSL/TLS
processing - ie. if that compression method is agreed, cause a zlib BIO
to be inserted (or removed, in the case of a renegotiation perhaps) onto
the application side of the SSL object's BIO chain (um, actually
"chains", one each for read and write I suppose) ... this essentially
does what Pablo was referring to, but lets the SSL/TLS handshake take
care of negotiating compression with the peer. The latter is the problem
with just inserting a compression BIO yourself: you need an out-of-band
(read: application) mechanism to decide when to use it or not.

It sounds a bit magic(k) though ... hmm ... perhaps buffering/flushes
would be the problem when applications use non-blocking sockets? If not,
this sounds easier than putting the zlib manipulation inside the SSL/TLS
layer and would probably give faster and better compression too.

oh yes: Pablo J Royo wrote;
Post by Pablo J Royo
I think the BIO would mantain the context (as z_stream struct of ZLIB
do) among several calls to BIO_write/read, so if you want to compress
communication data you have to chain this "zBIO" with a socket BIO.
almost - presumably the socket BIO you refer to is on the SSL/TLS data
side rather than the application data side, in which case your
compression won't do much. Compression is only useful on the assumption
that the application data itself is compressible, and by the time you get
SSL/TLS data - it's (hopefully) too well encrypted for compression to
have much effect. :-) I assume you meant to chain it with a memory/buffer
BIO? Ie. going from;

--> write_BIO --> >-- \
[app] [SSL] socket_BIO
<-- read_BIO <-- <-- /

to;

--> write_BIO --> zlib_BIO --> >--\
[app] [SSL] socket_BIO
<-- read_BIO <-- zlib_BIO <-- <--/

?
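(Wiring up the second chain might look like this; BIO_f_zlib() is an
assumption here -- in 2002 you would supply your own zBIO, as Pablo did,
though later zlib-enabled OpenSSL builds ship a filter under that name:)

    #include <openssl/bio.h>
    #include <openssl/ssl.h>
    #include <openssl/comp.h>

    BIO *make_compressed_chain(SSL_CTX *ctx, int fd)
    {
        BIO *sock = BIO_new_socket(fd, BIO_NOCLOSE);
        BIO *ssl  = BIO_new_ssl(ctx, 1 /* client */);
        BIO *zip  = BIO_new(BIO_f_zlib());

        BIO_push(ssl, sock);   /* [SSL] -> socket_BIO             */
        BIO_push(zip, ssl);    /* zlib_BIO -> [SSL] -> socket_BIO */
        return zip;            /* the app reads/writes plaintext here */
    }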

Cheers,
Geoff
--
Geoff Thorpe
***@geoffthorpe.net
http://www.geoffthorpe.net/


Kenneth R. Robinette
2002-11-27 20:24:47 UTC

Um, well that's one approach. But it's a little like saying "Let's let SSL/TLS take care
of agreeing on a cipher type, and then leave it up to the user application to take care
of the actual encryption/decryption." I would rather see the most commonly used
methods implemented within SSL/TLS itself.

Ken
Support
InterSoft International, Inc.
Voice: 888-823-1541, International 281-398-7060
Fax: 888-823-1542, International 281-398-0221
***@securenetterm.com
http://www.securenetterm.com

Geoff Thorpe
2002-11-27 21:44:05 UTC
Post by Kenneth R. Robinette
Um, well that's one approach. But it's a little like saying "Let's let
SSL/TLS take care of agreeing on a cipher type, and then leave it up to
the user application to take care of the actual encryption/decryption."
I would rather see the most commonly used methods implemented within
SSL/TLS itself.
If the SSL/TLS implementation is doing the (de)compression I don't see
what your point is. Ie. with this compression method negotiated by the
client and server, the SSL/TLS would still be responsible for handling
compression - it would just handle it on the application data before
applying the SSL/TLS framing rather than compressing data inside it. From
the application point of view, there's no need to implement anything. Did
you misunderstand me or vice versa?

Cheers,
Geoff
--
Geoff Thorpe
***@geoffthorpe.net
http://www.geoffthorpe.net/


pobox
2002-11-29 21:22:18 UTC
Post by Geoff Thorpe
Post by Kenneth R. Robinette
Um, well that's one approach. But it's a little like saying "Let's let
SSL/TLS take care of agreeing on a cipher type, and then leave it up to
the user application to take care of the actual encryption/decryption."
I would rather see the most commonly used methods implemented within
SSL/TLS itself.
If the SSL/TLS implementation is doing the (de)compression I don't see
what your point is. Ie. with this compression method negotiated by the
client and server, the SSL/TLS would still be responsible for handling
compression - it would just handle it on the application data before
applying the SSL/TLS framing rather than compressing data inside it. From
the application point of view, there's no need to implement anything. Did
you misunderstand me or vice versa?
Geoff,

I can't speak for Kenneth, but I'm not sure I get what you're saying
here. The data is first compressed and then encrypted according to
RFC2246. In my mind, once the application hands the data to OpenSSL
via SSL_write() or BIO_write() or _puts() or whatever, it is no longer
application data, even if compression has been negotiated.

I think it is best to first get the decompressor correct. My belief
is that a single decompressor can transparently handle the following
three possible compression scenarios:

1) Each record is compressed independently. The dictionary is reset
before each record. This appears to be the way OpenSSL currently works
(flush is Z_FULL_FLUSH). Compression ratio is worst of the three.

2) The dictionary is not reset between records. However, the current
compression buffer can be flushed (Z_SYNC_FLUSH), so that uncompressed
data does not span an SSL record boundary. Compression ratio is better
than #1.

3) The compression buffer is not flushed between records. Uncompressed
data may span SSL record boundaries. Best compression ratio.

#1 is the 'safest' in that it makes compression as transparent to
client applications as possible. #2 is almost as safe. For the most
part, #2 will be just as safe as #1. In fact, I can't really think of
any reasonable scenario in which this is not true, but strange things
are possible with accelerators, proxies, shims and whatnot. At a
minimum, #2 is absolutely necessary, e.g. for client protocols like
EAP-TLS.

A decompressor that has this functionality would be backward
compatible with the current OpenSSL scheme and forward compatible with
almost any reasonable implementation of ZLIB over TLS.
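
To make that concrete, here is a rough sketch (my own code, untested,
names made up) of such a decompressor: one z_stream per connection, set
up once with inflateInit() and torn down with inflateEnd(). Sync/full
flush points are invisible to inflate(), and if a peer happens to send
each record as a complete zlib stream, the Z_STREAM_END can be absorbed
with inflateReset(), so all three scenarios should pass through it:

    #include <zlib.h>

    /* Inflate one received record into out[]; assumes out[] is large
     * enough to hold the whole decompressed record. */
    static int record_inflate(z_stream *zs,
                              unsigned char *in, unsigned int inlen,
                              unsigned char *out, unsigned int outlen,
                              unsigned int *produced)
    {
        int ret;

        zs->next_in   = in;
        zs->avail_in  = inlen;
        zs->next_out  = out;
        zs->avail_out = outlen;

        ret = inflate(zs, Z_SYNC_FLUSH);
        if (ret == Z_STREAM_END)        /* record was a whole zlib stream */
            ret = inflateReset(zs);     /* stay ready for the next record */
        if (ret != Z_OK && ret != Z_BUF_ERROR)
            return -1;                  /* hard error */

        *produced = outlen - zs->avail_out;
        return 0;
    }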




Kenneth R. Robinette
2002-11-29 22:13:01 UTC
Permalink
From: "pobox" <***@pobox.com>
To: <openssl-***@openssl.org>
Subject: Re: OpenSSL and compression using ZLIB
Date sent: Fri, 29 Nov 2002 15:22:18 -0600
Send reply to: openssl-***@openssl.org

I was not sure either, and perhaps I did not take the time to completely
understand what Geoff was really saying. If so, I apologize. However, I
do agree with pobox's statements, quoted below.

The other thing that concerns me a little is the use of a zlib dll in the
Microsoft Windows environment. OpenSSL 0.9.7 makes use of a zlib dll,
although it could be optional. The specs are not really clear in this
area, and I have always had to mess around with the final makefile to get
compression and the Kerberos option to even work. I have been burned
several times by zlib dll's, since quite a few Windows-based programs
distribute zlib dll's, and not all are compatible, but all appear to have
the same exports. However, this is not an issue if there is an option to
link the zlib functions statically.
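
For what it's worth, 0.9.7's Configure does appear to distinguish the two
cases (though how cleanly the Windows makefiles honour them I can't say,
so treat this as an assumption to verify):

    ./config zlib            # link against the zlib library at build time
    ./config zlib-dynamic    # locate and load the zlib dll/shared library
                             # at runtime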

Getting the OpenSSL zlib support to operate as pobox describes, with the
ability to specify a compression level, would allow us, and other
Windows-based developers of SSH clients, to get rid of some redundant
logic. Most developers of these clients already use the OpenSSL EVP
cipher support; I suspect most UNIX SSH developers do also. We also use
zlib compression within our SSL/TLS-based telnet client, which
communicates mainly with the SRP project telnet server.

Ken
Post by pobox
[snip - pobox's message of 2002-11-29, quoted in full above]




Geoff Thorpe
2002-12-01 17:37:13 UTC
Permalink
Hi there,
Post by pobox
Geoff,
I can't speak for Kenneth, but I'm not sure I get what you're saying
here. The data is first compressed and then encrypted according to
RFC2246. In my mind, once the application hands the data to OpenSSL
via SSL_write() or BIO_write() or _puts() or whatever it is no longer
application data, even if compression has been negotiated.
I think it is best to firstly get the decompressor correct. My belief
is that a single decompressor can transparently handle the following
[snip]

I think my point ended up running orthogonally to the discussion taking
place :-) I was speaking from an implementation perspective, suggesting
that the compression and decompression *defined* in the custom SSL/TLS
compression methods could have their *implementation* completely
decoupled from the SSL/TLS record layer code.

It seems SSL/TLS compression is supposed to compress the data prior to
encrypting/MACing, so there seems little point implementing those
compression details inside that SSL/TLS record layer code. With OpenSSL's
SSL/TLS implementation relying heavily on BIOs already (be that for
better or worse), it seems that a very simple and possibly optimal way of
implementing compression would be simply to plonk two compression BIOs in
the appropriate chains whenever our "custom" SSL/TLS compression method
is negotiated at the SSL/TLS level. The alternative is to have various
bits and pieces of zlib logic splattered throughout the SSL/TLS code
despite the fact that it seems to do little more than just filter the
application data through zlib.

With respect to your 3 forms - they sort of collapse into one under this
scheme, though they could be kept separate via BIO_ctrl()s if desired.
Eg. if
you define more than one compression method to handle different flushing
and/or dictionary semantics, you could make the corresponding "BIO hooks"
in the SSL/TLS handshake code just prep the (de)compression BIOs with an
appropriate BIO_ctrl(). Ideally, "regular" compression should work OK and
the flushing should mirror how applications see "regular" SSL/TLS
already. Ie. if [BIO|SSL]_write(10 bytes of data) normally goes straight
out and generates a new SSL record, then the compression case should
compress the 10 bytes before doing the same. If applications and/or the
SSL/TLS implementation uses buffering under certain circumstances, that
should take effect at the compression layer too (ie. trying to reduce
fragmentation and SSL/TLS bandwidth overhead should likewise improve
compression quality by compressing larger blocks). There should be no
reason to reset the compressor state in *any* way, as SSL/TLS already
stipulates the requirement for an ordered and reliable data stream.
However, forcing such behaviour should be easy enough for compatibility
purposes if required.

(Note this approach keeps compression code in BIOs without duplicating it
in ssl/, so applications can use the BIOs independently too. Also, new
compression methods are easier to add - eg. define a libbzip2-based BIO
and add a new compression id+hook in the SSL/TLS code).
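
As a rough illustration of what that splice could look like (a sketch
only - it assumes a zlib-enabled build exposing a zlib filter BIO along
the lines of BIO_f_zlib(), and real code would live behind the
negotiation hooks described above):

    #include <openssl/bio.h>

    /* Push a compression filter onto an existing chain so application
     * data is deflated before the SSL/TLS framing ever sees it. */
    BIO *push_zlib_filter(BIO *chain)
    {
        BIO *zbio = BIO_new(BIO_f_zlib());

        if (zbio == NULL)
            return NULL;
        return BIO_push(zbio, chain);   /* writes now pass through zbio */
    }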

Cheers,
Geoff
--
Geoff Thorpe
***@geoffthorpe.net
http://www.geoffthorpe.net/

Strange yet thought-provoking time of year;
Muslims spend the month of Ramadan fasting and reflecting.
Americans celebrate liberty with a day full of eating.

Pablo J Royo
2002-12-02 08:19:20 UTC
Permalink
Post by Geoff Thorpe
(Note this approach keeps compression code in BIOs without duplicating it
in ssl/, so applications can use the BIOs independantly too. Also, new
compression methods are easier to add - eg. define a libbzip2-based BIO
and add a new compression id+hook in the SSL/TLS code).
I agree with this.
I've been using ZLIB for several years to compress big files, mixing
ZLIB code with PKCS7 code in OpenSSL, and a zBIO would be very useful.
I think there are a lot of messages on the OpenSSL users list asking for
something similar to handle big files, so all those people would also
benefit from this zBIO.
Also, OpenSSL is a big library now, so if separate ZLIB code must go
into libeay32.dll and ssleay32.dll, a lot of applications (on Windows
especially) would get bigger, and with Java around, the cost of
downloading binaries and the size of executables must be watched
carefully.
I'm aware I'm talking as a user here, and this is a developers' list...

Pablo J. Royo




Pablo Royo Moreno
2002-11-28 20:39:36 UTC
Permalink
Post by Geoff Thorpe
The latter is the problem with just putting the compression layer inside
the SSL/TLS layer, you need an out-of-band (read: application) mechanism
to decide when to use it or not.
I must admit I didn't think of this problem when I posted my message (I'm
not an expert :-( ), because I have only used this kind of BIO with PKCS7
files, and there the PKCS7 attribute "Type Of ContentData" (I don't
remember the exact OID) gives that "out-of-band" decision, telling me how
to read the file when I receive it - uncompress or not.
I suppose SSL has some way to extend negotiation to fit future new
features, so this "out-of-band" could fit there.
Post by Geoff Thorpe
perhaps buffering/flushes would be the problem when applications use
non-blocking sockets?
I think ZLIB allows partial reads/writes with its internal queuing
mechanics, as you can see by looking at the gz_read()/gz_write()
functions. In fact, what these functions do is very similar to what BIOs
do to handle partial reads/writes (wait for another invocation and then
see if the stream can go on).
In the implementation I mentioned before, this logic is taken out of
gz_read()/gz_write() and put in a BIO.
So I suppose that if a socket returns fewer bytes than expected, this
layer would return not the new bytes received, but whatever uncompressed
data is already available, or nothing until a new uncompress can take
place.
I'm not sure, but isn't this how BIOs work? Then this is the same and
should work.
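Something like the following sketch (my own names, untested) is what I
mean: return whatever inflate() can produce now, and signal a retry,
BIO-style, when it needs more input:

    #include <zlib.h>

    /* zs->next_in/avail_in hold whatever the socket has delivered so far */
    int zbio_partial_read(z_stream *zs, unsigned char *out, int outlen)
    {
        int n;

        zs->next_out  = out;
        zs->avail_out = (uInt)outlen;

        switch (inflate(zs, Z_SYNC_FLUSH)) {
        case Z_OK:
        case Z_BUF_ERROR:           /* made no progress: needs more input */
            n = outlen - (int)zs->avail_out;
            return n > 0 ? n : -1;  /* -1: retry later, as with
                                     * BIO_should_retry() */
        default:
            return -2;              /* hard error */
        }
    }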
About flush, I can just say it works with file, zBIO, MD5, cipher and
base64 BIOs all chained together.
Post by Geoff Thorpe
I assume you meant to chain it with a memory/buffer
BIO? Ie. going from;
--> write_BIO --> >-- \
[app] [SSL] socket_BIO
<-- read_BIO <-- <-- /
to;
--> write_BIO --> zlib_BIO --> >--\
[app] [SSL] socket_BIO
<-- read_BIO <-- zlib_BIO <-- <--/
?
Yes, the first BIO must be something producing plain compressible data
that the second zBIO can easily compress.


