Sunday, 10 June 2012

First progress report

    As I mentioned in the previous post, my port of pseudo-tcp to Java is at an advanced stage. In this post I'll try to cover most of what has been done so far. By the way, the project's source code is published at http://code.google.com/p/pseudo-tcp-java-impl/

    I started work on this project at the end of April, when the list of accepted students was published. I created an overall plan for how I'm going to accomplish my task:
  1. Create some test code in C++ to test against libjingle's implementation and research how the protocol works (already done while applying)
  2. Establish packet-level communication with the C++ pseudotcp (ensure that header fields transfer correctly, check byte order, etc.)
  3. Finish refactoring the code and transfer some data between Java and C++
  4. Write unit tests and make sure that they all pass. A full set of tests already exists in libjingle, so they need to be ported.
  5. Test performance and compare it with C++.
  6. Create the final Java interface for the protocol and integrate it with ice4j. A diagram with a class overview from my GSoC application can be found here. The classes now have different names, but the concept is almost the same.
The beginning

    As a starting point I copy-pasted the PseudoTcp class directly from C++. This class contains all of the protocol's logic and, as I mentioned in the previous post, it requires a specific environment to run correctly. I had to create one thread for tracking time changes and another to handle the network socket.
    My goal was to exchange some sample packets, and the first problem I encountered was that there are no unsigned data types in Java. To handle header fields of type unsigned 32-bit int I use Java's primitive long type. This allows any calculation to be performed on such fields. To transfer them over the network, special routines are required which write them to and read them from byte buffers. For example, a function which stores a long in a buffer may look something like this:

// Writes the low 32 bits of uInt into buf at offset, most significant byte first (network byte order)
void long_to_bytes(long uInt, byte[] buf, int offset)
{
 buf[offset] = (byte) ((uInt & 0xFF000000L) >>> 24);
 buf[offset + 1] = (byte) ((uInt & 0x00FF0000L) >>> 16);
 buf[offset + 2] = (byte) ((uInt & 0x0000FF00L) >>> 8);
 buf[offset + 3] = (byte) (uInt & 0x000000FFL);
}
    There is also a special buffer class in java.nio called ByteBuffer, but it can be used in only one direction at a time (either writing or reading) and in the end it was of no use to me. At first it looked very interesting, as it also includes an option to specify the byte order. I didn't know at the time that Java uses network byte order by default.
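
The reverse direction can be handled in the same way. The following is a minimal sketch, not the code from the project: the class name UInt32Codec and its method names are made up for illustration. It reads four big-endian bytes back into a long, and for comparison shows the same write and read done with ByteBuffer's absolute putInt/getInt, which use big-endian order unless told otherwise.

```java
import java.nio.ByteBuffer;

// Hypothetical helper class, named here only for illustration.
final class UInt32Codec {
    // Reads 4 bytes in network byte order as an unsigned 32-bit value held in a long.
    static long bytesToLong(byte[] buf, int offset) {
        return ((buf[offset] & 0xFFL) << 24)
             | ((buf[offset + 1] & 0xFFL) << 16)
             | ((buf[offset + 2] & 0xFFL) << 8)
             | (buf[offset + 3] & 0xFFL);
    }

    // Writes the low 32 bits of uInt using ByteBuffer (big-endian by default).
    static void longToBytes(long uInt, byte[] buf, int offset) {
        ByteBuffer.wrap(buf).putInt(offset, (int) uInt);
    }

    // Reads with ByteBuffer; the mask turns the signed int back into an unsigned value.
    static long bytesToLongNio(byte[] buf, int offset) {
        return ByteBuffer.wrap(buf).getInt(offset) & 0xFFFFFFFFL;
    }
}
```

The mask with 0xFFL (or 0xFFFFFFFFL) matters because Java bytes and ints are signed: without it, a value with the top bit set would be sign-extended when widened to long.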

PseudoTcp header

//////////////////////////////////////////////////////////////////////
//
//    0                   1                   2                   3
//    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
//    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
//  0 |                      Conversation Number                      |
//    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
//  4 |                        Sequence Number                        |
//    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
//  8 |                     Acknowledgment Number                     |
//    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
//    |               |   |U|A|P|R|S|F|                               |
// 12 |    Control    |   |R|C|S|S|Y|I|            Window             |
//    |               |   |G|K|H|T|N|N|                               |
//    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
// 16 |                       Timestamp sending                       |
//    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
// 20 |                      Timestamp receiving                      |
//    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
// 24 |                             data                              |
//    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
//
//////////////////////////////////////////////////////////////////////

  • Conversation Number - identifies the current conversation
  • Sequence Number - the position of the current packet in the conversation's whole data stream
  • Acknowledgment Number - indicates up to which sequence number received data has been acknowledged
  • Control - one byte which identifies the control action; currently only "connect" is used
  • Window - number of bytes that the receiver is able to accept
  • Timestamp sending - clock count in milliseconds indicating when the packet was sent
  • Timestamp receiving - clock count showing when we received the last packet from the remote side
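
To make the layout above concrete, here is a rough sketch of how such a header could be parsed in Java. It is not the parser from the project: the class PseudoTcpHeader and its field names are assumptions chosen for this example, and the offsets are simply read off the diagram (control byte at 12, flag bits in the byte at 13, 16-bit window at 14, payload from 24).

```java
// Hypothetical holder for the header fields; names are illustrative only.
final class PseudoTcpHeader {
    static final int HEADER_SIZE = 24; // payload starts at offset 24 in the diagram

    long conversation; // unsigned 32-bit values are kept in longs
    long sequence;
    long ack;
    int control;       // one byte at offset 12
    int flags;         // flag bits in the byte at offset 13
    int window;        // unsigned 16-bit value at offset 14
    long tsSend;
    long tsRecv;

    // Reads an unsigned 32-bit big-endian value.
    private static long uint32(byte[] b, int off) {
        return ((b[off] & 0xFFL) << 24) | ((b[off + 1] & 0xFFL) << 16)
             | ((b[off + 2] & 0xFFL) << 8) | (b[off + 3] & 0xFFL);
    }

    static PseudoTcpHeader parse(byte[] packet) {
        if (packet.length < HEADER_SIZE) {
            throw new IllegalArgumentException("packet shorter than header");
        }
        PseudoTcpHeader h = new PseudoTcpHeader();
        h.conversation = uint32(packet, 0);
        h.sequence = uint32(packet, 4);
        h.ack = uint32(packet, 8);
        h.control = packet[12] & 0xFF;
        h.flags = packet[13] & 0xFF;
        h.window = ((packet[14] & 0xFF) << 8) | (packet[15] & 0xFF);
        h.tsSend = uint32(packet, 16);
        h.tsRecv = uint32(packet, 20);
        return h;
    }
}
```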
FifoBuffer

    After packets were transferred successfully and header fields were properly parsed, I moved on to data transfer. I spent the most time dealing with the FIFO buffer class. The current protocol implementation uses one for storing received data and one to queue data which is waiting to be sent. It works like a queue of bytes, but also has a few specific functions, for example writing or reading at a specified offset from the buffer's current position without affecting that position. This may be used to store packets which are delivered out of order.
    My first implementation used java.nio.ByteBuffer internally. This class can only be written to or read from at any one time. I kept track of the read position so as not to rewind the buffer at every single read operation, but only when it reached one half of the buffer's total size. This turned out to cause too small a window to be sent to the remote side, because the buffer's space was cleared too rarely. I discovered it only when the window unit tests were failing. Increasing the frequency caused a noticeable decrease in performance.
    The second one, which I called EfficentFifoBuffer :), is based on a single byte array. I store the read and write positions and use them to track available space and buffered data. They wrap around the array using a "modulo length" operation. It took much more work to get this working, but it resolved the performance issues.
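
The idea of a byte array with wrapping positions can be sketched as follows. This is a simplified illustration and not the actual EfficentFifoBuffer: the class RingFifo and the method names writeOffset and consumeWriteBuffer are assumptions for this example. It tracks the read position plus the number of buffered bytes, and shows a write at an offset that does not move the write position.

```java
// Hypothetical simplified ring buffer; not the project's EfficentFifoBuffer.
final class RingFifo {
    private final byte[] data;
    private int readPos;  // index of the oldest buffered byte
    private int buffered; // number of bytes currently stored

    RingFifo(int capacity) { data = new byte[capacity]; }

    int buffered() { return buffered; }

    int writeAvailable() { return data.length - buffered; }

    // Appends up to len bytes and returns how many were accepted.
    int write(byte[] src, int off, int len) {
        int n = Math.min(len, writeAvailable());
        int writePos = (readPos + buffered) % data.length;
        for (int i = 0; i < n; i++) {
            data[(writePos + i) % data.length] = src[off + i];
        }
        buffered += n;
        return n;
    }

    // Removes up to len bytes from the front and returns how many were read.
    int read(byte[] dst, int off, int len) {
        int n = Math.min(len, buffered);
        for (int i = 0; i < n; i++) {
            dst[off + i] = data[(readPos + i) % data.length];
        }
        readPos = (readPos + n) % data.length;
        buffered -= n;
        return n;
    }

    // Stores bytes 'offset' positions past the write position without moving it,
    // e.g. for a packet that arrived ahead of a missing one.
    int writeOffset(byte[] src, int off, int len, int offset) {
        int free = writeAvailable() - offset;
        if (free <= 0) return 0;
        int n = Math.min(len, free);
        int start = (readPos + buffered + offset) % data.length;
        for (int i = 0; i < n; i++) {
            data[(start + i) % data.length] = src[off + i];
        }
        return n;
    }

    // Marks 'count' bytes already placed with writeOffset as buffered.
    void consumeWriteBuffer(int count) {
        if (count > writeAvailable()) throw new IllegalArgumentException("count too large");
        buffered += count;
    }
}
```

Because reading only advances readPos, free space becomes available immediately after each read, with no copying or compaction step.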

Unit tests

    Porting the unit tests also required plenty of work, as libjingle uses a kind of custom threading model. Many operations are performed by posting messages to threads, which then process them. There is also an option to send delayed messages, which is used to emulate delayed data packets in the unit tests.
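
One possible way to emulate delayed messages in Java tests, without depending on real timing, is a queue driven by a virtual clock. The sketch below is only an assumption about how this could be done and is neither libjingle's model nor the ported test code: the class DelayedMessageQueue and its methods are invented for this example.

```java
import java.util.PriorityQueue;

// Hypothetical test helper: messages are Runnables that become due on a virtual clock.
final class DelayedMessageQueue {
    private static final class Entry implements Comparable<Entry> {
        final long dueAt;
        final long order; // keeps posting order for messages due at the same time
        final Runnable task;

        Entry(long dueAt, long order, Runnable task) {
            this.dueAt = dueAt;
            this.order = order;
            this.task = task;
        }

        @Override
        public int compareTo(Entry o) {
            int c = Long.compare(dueAt, o.dueAt);
            return c != 0 ? c : Long.compare(order, o.order);
        }
    }

    private final PriorityQueue<Entry> queue = new PriorityQueue<>();
    private long now;       // virtual time in milliseconds
    private long nextOrder;

    void post(Runnable task) { postDelayed(0, task); }

    void postDelayed(long delayMs, Runnable task) {
        queue.add(new Entry(now + delayMs, nextOrder++, task));
    }

    // Advances the virtual clock and runs every message that has become due.
    void advance(long ms) {
        long target = now + ms;
        while (!queue.isEmpty() && queue.peek().dueAt <= target) {
            Entry e = queue.poll();
            now = e.dueAt;
            e.task.run();
        }
        now = target;
    }
}
```

A test could post a "deliver packet" message with a delay and then call advance to decide exactly when it arrives, which keeps the run deterministic.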

What next?

    At the moment almost all tests are completed, except for one window test which I plan to finish later, as I don't have an easy way to make it work yet. I'm working on finishing the PseudoTcpSocket class, which is the public interface of the protocol. It uses DatagramSockets to communicate over the network and controls the required threads. Maybe it would be a nice idea to use some interface instead of DatagramSocket directly. It would make it possible to port this code, for example, to Java ME, which has a different class for handling UDP datagrams. I also have to discuss the final interface on the dev list and determine how it will be integrated with the ice4j library.
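
The interface idea could look roughly like the sketch below. Nothing here is decided yet and none of these names exist in the project: DatagramTransport, UdpTransport and InMemoryTransport are hypothetical. The UDP variant assumes a DatagramSocket that is already connected to the remote side, and the in-memory variant shows that the same interface would also work without any network, for example in unit tests.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.util.Arrays;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical abstraction over "something that moves datagrams".
interface DatagramTransport {
    void send(byte[] data, int length) throws IOException;
    int receive(byte[] buffer) throws IOException; // blocks, returns datagram length
    void close();
}

// Java SE variant; assumes the socket was connected to the peer beforehand.
final class UdpTransport implements DatagramTransport {
    private final DatagramSocket socket;

    UdpTransport(DatagramSocket connectedSocket) { this.socket = connectedSocket; }

    public void send(byte[] data, int length) throws IOException {
        socket.send(new DatagramPacket(data, length));
    }

    public int receive(byte[] buffer) throws IOException {
        DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
        socket.receive(packet);
        return packet.getLength();
    }

    public void close() { socket.close(); }
}

// Network-free variant: two endpoints that deliver into each other's inbox.
final class InMemoryTransport implements DatagramTransport {
    private final BlockingQueue<byte[]> inbox = new LinkedBlockingQueue<>();
    private InMemoryTransport peer;

    static InMemoryTransport[] pair() {
        InMemoryTransport a = new InMemoryTransport();
        InMemoryTransport b = new InMemoryTransport();
        a.peer = b;
        b.peer = a;
        return new InMemoryTransport[] {a, b};
    }

    public void send(byte[] data, int length) {
        peer.inbox.add(Arrays.copyOf(data, length));
    }

    public int receive(byte[] buffer) {
        try {
            byte[] datagram = inbox.take();
            int n = Math.min(datagram.length, buffer.length);
            System.arraycopy(datagram, 0, buffer, 0, n);
            return n;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a datagram", e);
        }
    }

    public void close() { }
}
```

With something like this, the socket class would depend only on DatagramTransport, and a Java ME port would need to supply just one more implementation.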

Monday, 4 June 2012

Hello there!

After all the effort I put into my application for GSoC 2012, I managed to get accepted. My host organization is Jitsi.org, under the umbrella of the XMPP Standards Foundation. The project that I'm working on is an implementation of the pseudo-tcp protocol in Java. It's about reliable and controlled data transfer that is lightweight at the same time. This can be achieved by using UDP as the transport layer and implementing only the most essential parts of the TCP specification. In fact the protocol is already implemented in C++ in Google's libjingle. My implementation must be fully compatible with that one.

My work started while I was writing my application. I spent a few days digging into libjingle's code and documentation to build and run it. After that I realized that it wasn't so simple to run the protocol itself on plain sockets. It was hidden beneath a few other layers which were of no use to me, and it was impossible to get into them in the small amount of time I had. What was available was a class containing strictly the protocol's logic, which requires a specific environment to run. It exposes an interface through which it is notified about the passage of time and about data received from the other side. So in fact there must be one thread to track time and another to handle the socket. It may not be much for someone who writes C++ every day, but I had some problems with that, as I had only used it some time ago for small projects during my studies. Finally I managed to create some code which transferred data between two UDP sockets.

The other task I worked on while applying for the project was to find out how the protocol could be integrated with the ice4j library (an ICE protocol implementation for Java). So I read the introductory parts of the ICE RFC and tried to understand how it works. As I understand it, it's about choosing the best connection option out of all that are available. When we connect through the internet using some application, there are usually a few routers on the way and our external IP address is different from the local one. Sometimes we can be reached via the local or the external address, or by using a STUN server, or maybe something else. ICE is there to choose the best option for us.

My first idea was to add a new transport type to the ICE library, alongside the existing UDP and TCP. Later I thought that maybe there's no need for that, and that it would be enough to discover only UDP connectivity (there would be no check whether there's pseudo-tcp on the other side) and then just use the UDP addresses to establish a pseudo-tcp connection. In this case we would have to decide at some higher level that we are going to use pseudo-tcp on those sockets. However, I'm still not sure which approach I should use, and I'll decide later, when the protocol's implementation is finished. The Jitsi dev team will probably help me with that.

It's been a while since I started working on the project, although the official start date was the 21st of May. The protocol is now almost complete and I'll write about that next time.