Monthly Archives: February 2014

  • Interesting Threading Issue In .Net

    Yesterday I noticed a strange anomaly in the logs of an application I wrote and manage at work while investigating another issue. It manifested as us sending duplicate messages from a queue to a third party over and over.

    I looked into the code and since threading was involved, I figured there was some thread safety/shared state issue involved. I consulted with a couple of coworkers who didn’t immediately notice any issues, but I created some simple test cases to test my assumptions about threading with lambdas.

    Simple Example of the problem

    Below you will see me attempting to create 5 workers, do a small unit of work, and then write to the console. I would expect each worker to write out the Int32 it was created with. My assumption was that the C# compiler would, with the lambda expression, create a copy of i for the thread being created and use that copy. I was entirely wrong.

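    As you can see, each Console.WriteLine on the initial thread, at the time the worker is created, has the right value of i. By the time the threads actually run, though, every one of them sees the final value of i, 6 (the for loop increments i one last time after 5, which is what causes the loop to exit). The lambda captures the variable i itself, not a copy of its value at creation time, so all five workers share it.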

    var rand = new Random();
    var threads = new List<Thread>();
    
    for (int i = 1; i <= 5; i++)
    {
        Console.WriteLine("Creating worker {0}.", i);
        
        Thread t = new Thread(() =>
        {
            // Simulate work
            Thread.Sleep(rand.Next(500, 2000));
            
            Console.WriteLine("Finished running worker {0}.", i);
        });
        threads.Add(t);
    }
    
    threads.ForEach(t => t.Start());
    threads.ForEach(t => t.Join());
    
    /* Output:
    Creating worker 1.
    Creating worker 2.
    Creating worker 3.
    Creating worker 4.
    Creating worker 5.
    Finished running worker 6.
    Finished running worker 6.
    Finished running worker 6.
    Finished running worker 6.
    Finished running worker 6. */
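
    The capture behavior can be reproduced without any threads at all. The following is a minimal sketch, not code from the real application: CaptureDemo and CaptureLoopVariable are names made up for illustration. It builds five delegates in a for loop and only invokes them once the loop has finished.

```csharp
using System;
using System.Collections.Generic;

public static class CaptureDemo
{
    // Hypothetical helper: builds five delegates inside a for loop
    // and invokes them only after the loop has ended.
    public static List<int> CaptureLoopVariable()
    {
        var readers = new List<Func<int>>();

        for (int i = 1; i <= 5; i++)
        {
            // Every lambda captures the single variable i, not its current value.
            readers.Add(() => i);
        }

        // The loop has exited, so the shared i is now 6 for every delegate.
        var results = new List<int>();
        foreach (var read in readers)
        {
            results.Add(read());
        }
        return results;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", CaptureLoopVariable()));
        // Prints: 6, 6, 6, 6, 6
    }
}
```

    The threaded version behaves the same way for the same reason; the Thread.Sleep just guarantees the loop is long finished before any worker reads i.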
    

    Simple Fix

    The C# compiler did not make a copy of the state, but we can do this directly and pass it in using ParameterizedThreadStart. This makes the list a collection of Int32/Thread pairs. Obviously, in our actual app, our state object is more complex than an Int32.

    var rand = new Random();
    var threads = new List<Tuple<int, Thread>>();
    
    for (int i = 1; i <= 5; i++)
    {
        Console.WriteLine("Creating worker {0}.", i);
        
        Thread t = new Thread(x =>
        {
            // Simulate work
            Thread.Sleep(rand.Next(500, 2000));
            
            Console.WriteLine("Finished running worker {0}.", x);
        });
        threads.Add(Tuple.Create(i, t));
    }
    
    threads.ForEach(tuple => tuple.Item2.Start(tuple.Item1));
    threads.ForEach(tuple => tuple.Item2.Join());
    
    /* Output:
    Creating worker 1.
    Creating worker 2.
    Creating worker 3.
    Creating worker 4.
    Creating worker 5.
    Finished running worker 1.
    Finished running worker 4.
    Finished running worker 2.
    Finished running worker 3.
    Finished running worker 5. */
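
    Another common way to get a per-worker copy, shown here only as an alternative sketch and not what the real application does, is to declare a new local variable inside the loop body and capture that instead of i. The names LocalCopyDemo, RunWorkers, and workerId are invented for this example, and the simulated work is left out so the result is easy to check.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class LocalCopyDemo
{
    // Hypothetical helper: starts five threads and returns the ids they saw, sorted.
    public static List<int> RunWorkers()
    {
        var threads = new List<Thread>();
        var finished = new List<int>();
        var finishedLock = new object();

        for (int i = 1; i <= 5; i++)
        {
            // A fresh variable exists for each iteration, so each lambda
            // captures its own workerId rather than the shared loop variable.
            int workerId = i;

            threads.Add(new Thread(() =>
            {
                // List<int> is not thread-safe, so guard the shared list.
                lock (finishedLock)
                {
                    finished.Add(workerId);
                }
            }));
        }

        threads.ForEach(t => t.Start());
        threads.ForEach(t => t.Join());

        // Threads finish in no particular order; sort for a stable result.
        finished.Sort();
        return finished;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", RunWorkers()));
        // Prints: 1, 2, 3, 4, 5
    }
}
```

    This keeps the plain ThreadStart lambda, at the cost of relying on a subtle scoping rule that is easy to undo by accident in a later refactoring.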
    

    Better Fix

    That works, but it doesn’t really reflect the original intent. I receive X number of things to do, and in the real product, a Semaphore was used to control the maximum number of messages that were sent at a time.

    For instance, if I received 200 messages from the queue to send and I can send 50 messages at a time, I would spin up 200 threads which would wait on a semaphore, sending 50 maximum at a time. Obviously this is inefficient, and I don’t really have an excuse for why I did it this way when I converted it from a single-threaded process that could not keep up with demand to a multi-threaded process which ended up with this duplication problem. In retrospect, I would never have done this.

    The following has a queue of 15 work items serviced by 5 worker threads, and closely represents how the code works now.

    Queue With Workers Serving It
    var rand = new Random();
    var threads = new List<Thread>();
    var queueLocker = new object();
    var queue = new Queue<int>();
    const short maxWorkers = 5;
    
    // Create dummy data for processing
    for (int job = 1; job <= 15; job++)
    {
        queue.Enqueue(job);
    }
    
    for (int i = 1; i <= maxWorkers; i++)
    {
        Console.WriteLine("Creating worker {0}.", i);
        
        Thread t = new Thread(() =>
        {
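            // Try to get job from queue to handle. The unlocked Count check is
            // only a hint; the check inside the lock is the authoritative one.
            while (queue.Count > 0)
            {
                // Reset on every pass so a stale job is never processed twice
                int? job = null;
                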
                lock (queueLocker)
                {
                    if (queue.Count > 0)
                    {
                        job = queue.Dequeue();
                    }
                }
                
                if (job.HasValue)
                {
                    // Simulate work
                    Thread.Sleep(rand.Next(500, 2000));
                    
                    Console.WriteLine(
                        "Worker {0} finished running job {1}.",
                        Thread.CurrentThread.Name,
                        job);
                }
            }
    
            Console.WriteLine(
                "Worker {0} has no more work. Exiting.",
                Thread.CurrentThread.Name);
        });
        t.Name = i.ToString();
        threads.Add(t);
    }
    
    threads.ForEach(t => t.Start());
    threads.ForEach(t => t.Join());
    
    /* Output:
    Creating worker 1.
    Creating worker 2.
    Creating worker 3.
    Creating worker 4.
    Creating worker 5.
    Worker 4 finished running job 4.
    Worker 5 finished running job 5.
    Worker 1 finished running job 1.
    Worker 2 finished running job 2.
    Worker 3 finished running job 3.
    Worker 1 finished running job 8.
    Worker 2 finished running job 9.
    Worker 4 finished running job 6.
    Worker 5 finished running job 7.
    Worker 3 finished running job 10.
    Worker 1 finished running job 11.
    Worker 1 has no more work. Exiting.
    Worker 2 finished running job 12.
    Worker 2 has no more work. Exiting.
    Worker 4 finished running job 13.
    Worker 4 has no more work. Exiting.
    Worker 5 finished running job 14.
    Worker 5 has no more work. Exiting.
    Worker 3 finished running job 15.
    Worker 3 has no more work. Exiting. */
    

    Even Better Fix I Can’t Use

    Unfortunately, this code is limited to .Net 3.5 right now, but this particular problem looks like a great match for the Task Parallel Library in .Net >= 4.0. It would offload all the thread handling to .Net which is particularly well-suited to this problem: running tasks in parallel.
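
    A rough sketch of what that might look like, assuming .Net 4.0 or later is available. TplDemo and ProcessJobs are hypothetical names, the job body is a placeholder for sending a message, and MaxDegreeOfParallelism stands in for the 5-worker limit used above.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

public static class TplDemo
{
    // Hypothetical helper: processes jobs 1..jobCount with at most
    // maxWorkers running at once, and returns the completed job ids, sorted.
    public static int[] ProcessJobs(int jobCount, int maxWorkers)
    {
        var completed = new ConcurrentBag<int>();
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxWorkers };

        Parallel.ForEach(Enumerable.Range(1, jobCount), options, job =>
        {
            // Placeholder for the real work, e.g. sending one message.
            // job is a parameter of the lambda, so there is no shared capture.
            completed.Add(job);
        });

        return completed.OrderBy(job => job).ToArray();
    }

    public static void Main()
    {
        int[] done = ProcessJobs(15, 5);
        Console.WriteLine("Processed {0} jobs.", done.Length);
        // Prints: Processed 15 jobs.
    }
}
```

    Parallel.ForEach blocks until every job has finished, so there is no list of threads to start and join, and no hand-written queue locking.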


  • How To Fix Verizon FiOS Problem Connecting To Websites

    Feb 26, 2014 Update:

    Verizon appears to have solved the problem on their end and an MTU of 1500 will work again. The original problem lasted from at least the beginning of Feb 24, 2014 until mid-day Feb 26, 2014.

    Verizon’s escalation team later called me back and provided the following information:

    New uplinks were installed in Lewisville and Plano which were improperly configured and the configuration has been corrected.

    The information below remains as a historical record of the problem and a workaround.

    Verizon Reports the Outage is Resolved

    Background on the Problem

    I live in the Dallas market for Verizon FiOS, which seems to be where the problems were happening. For me the issue manifested primarily as an inability to play content reliably on YouTube, but I had many other issues on other CDNs. Some sites, such as MitchRibar.com, would never load at all for me and for some other FiOS subscribers.

    The error displayed in Chrome was ERR_CONNECTION_RESET.

    List of sites affected

    The following sites were reported affected by friends, others on the Internet, and me:

    Diagnosis

    I called Verizon multiple times and they were no help of course. Since a ping worked and since a traceroute exited their network they said that the problem was either on my side or YouTube’s side and didn’t care that it affected multiple sites. I tried to explain the difference between ICMP and a TCP session but they aren’t that smart, of course. They wouldn’t even talk to me until I plugged in their router which is a terrible piece of equipment I haven’t used for years. I obliged and that’s when I got the above from them. They would not let me talk to a higher tier of support.

    However, after finding out some friends had the same issue, I was tipped off to this forum post started by another Verizon customer. You’ll see that most if not all of the people in the thread are from the Dallas market.

    I did some ping tests to validate that the problem is the MTU setting. Somewhere in Verizon’s network, close to the DFW side of the route, someone has messed up the MTU and reduced it from the default of 1500 to 1496. Keep in mind that each ping carries 28 bytes of headers (a 20-byte IP header plus an 8-byte ICMP header), so the largest successful (non-fragmented) ping size + 28 = MTU.

    In the paste below, -f prevents fragmentation of the packet and -l 1472 or -l 1468 sets the ping payload length. The 28 bytes of IP and ICMP headers are added on top of that length. Ping parameters differ between platforms; this example is from Windows, so check the equivalent options for your platform.

    >ping mitchribar.com -f -l 1472
    
    Pinging mitchribar.com [205.134.224.227] with 1472 bytes of data:
    Request timed out.
    
    Ping statistics for 205.134.224.227:
        Packets: Sent = 1, Received = 0, Lost = 1 (100% loss),
    Control-C
    ^C
    
    >ping mitchribar.com -f -l 1468
    
    Pinging mitchribar.com [205.134.224.227] with 1468 bytes of data:
    Reply from 205.134.224.227: bytes=1468 time=42ms TTL=55
    

    As you can see, 1472 (which equates to a 1500 MTU [1472+28=1500]) did not work. I kept lowering the size until it worked at 1468, which equates to an MTU of 1496 [1468+28=1496]. So, because of Verizon’s now-broken network, we must lower the MTU from the default of 1500 to 1496 to ensure packets traverse the network correctly.
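
    The arithmetic and the ping test can also be sketched in C#. This is an illustration under assumptions, not a tool I used: MtuProbe and its methods are made-up names, the 64 TTL and 3000 ms timeout are arbitrary, and the 28-byte overhead assumes an IPv4 header without options. PingOptions with dontFragment set to true plays the role of -f, and the buffer length plays the role of -l.

```csharp
using System;
using System.Net.NetworkInformation;

public static class MtuProbe
{
    // 20-byte IPv4 header (no options) + 8-byte ICMP echo header.
    public const int HeaderOverhead = 28;

    // Ping payload size -> size of the packet on the wire.
    public static int MtuForPayload(int payloadBytes)
    {
        return payloadBytes + HeaderOverhead;
    }

    // MTU -> largest ping payload that fits without fragmenting.
    public static int PayloadForMtu(int mtu)
    {
        return mtu - HeaderOverhead;
    }

    // Sends one echo request with Don't Fragment set, like "ping -f -l <size>".
    public static bool PingWithoutFragmenting(string host, int payloadBytes)
    {
        using (var ping = new Ping())
        {
            var options = new PingOptions(64, true); // ttl, dontFragment
            PingReply reply = ping.Send(host, 3000, new byte[payloadBytes], options);
            return reply.Status == IPStatus.Success;
        }
    }

    public static void Main(string[] args)
    {
        Console.WriteLine("Payload 1472 -> MTU {0}", MtuForPayload(1472)); // 1500
        Console.WriteLine("Payload 1468 -> MTU {0}", MtuForPayload(1468)); // 1496

        // Only touches the network when a host name is passed on the command line.
        if (args.Length > 0)
        {
            foreach (int payload in new[] { 1472, 1468 })
            {
                bool ok = PingWithoutFragmenting(args[0], payload);
                Console.WriteLine("{0} bytes: {1}", payload, ok ? "reply" : "no reply");
            }
        }
    }
}
```

    On a healthy 1500 MTU path both sizes should get a reply; during the outage described here, only the 1468-byte payload did.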

    I don’t use Verizon’s router (I use DD-WRT and changing the MTU is easy: Setup -> MTU) but Jake Smith provided these screenshots to me that I edited to show the steps. They come from a regular Verizon FiOS router.

    Steps to Change the MTU

    You should not need to change router settings unless this problem has happened again after Feb 26, 2014. Please see the note at the top of the page.


    Connect to your router using a web browser

    There are many resources available to find out what your router’s LAN IP address is. Connect to this address in your web browser. Your FiOS router has the default password printed on it if you have not changed it.

    My router’s IP is 192.168.0.1, but yours may (and probably will) be different. With my example, I would navigate to http://192.168.0.1 in my web browser.

    Step 1: Click My Network

    Step 2: Click Network Connections

    Step 3: Click Broadband Connection or the Pencil Icon to Edit

    Step 4: Click Configure Connection

    Step 5: Change MTU to Manual and 1496