The Traveling Developer

The thoughts of a traveling computer scientist.

Are infinite loops really necessary?

I came across a question on StackOverflow about whether using infinite while loops is a bad programming practice.

while (true) {
// do something
if (condition) break;
}

NOTE: This also includes do-while.

Most of the answers had compelling reasons for and against it. Tanktalus had an answer that got me thinking:

“END WHILE” is just a nasty goto. “IF” is just a nasty goto. You can say the same about any flow-control construct. That means that we just need to have a better definition of what is “nasty”. The general rule is that loop constructs and subroutine exits are allowed to go to an earlier point in execution, all others must go to a later point in execution. Since break goes to a later point in execution, we’re ok there.

The real rule is something more along the lines of how readable it is. How easy is it for the maintenance guy to follow the flow of execution? If it’s easy, problem solved. If not, then we have a bad construct.

This is pretty subjective, but I’d base my decision on:

  1. how long is the loop?
  2. how close is the break to the beginning or end of the loop?
  3. how easy is it to reorder things to move the break (or eliminate it altogether) without repeating code? Moving code to another subroutine may be an option, but can play havoc with scoping, or even just having things close together that are conceptually close together.

In a nutshell, it does not matter how the code is written as long as it is readable. Taking shortcuts and writing obfuscated code might get the job done in record time, but at some point someone is going to have to maintain it, and most of the time you will not be there to explain your thought process.

The while (true) statement has been in my head for a while now. When I find myself writing a while (true) I stop for a minute and just think about why I am doing this. Nine times out of ten I can come up with a more effective way of writing it without the true.

While loops are used when we have no idea how many iterations a task will take to complete. The boolean condition that breaks the loop tells us what we are trying to do. With an infinity sign in its place how will we know the purpose of the loop? We have to dig into the braces to find the break condition. And what if there are several break conditions? I am not saying that break statements should not be used. A break should indicate an error or something unexpected that would cause an early stop to the loop.
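To make the point concrete, here is a minimal sketch in Python (the language is my choice for illustration, and the function names and the list used as a queue are made up). Both functions do the same work, but only the second one states its purpose in the loop header.

```python
def drain_with_infinite_loop(queue):
    """Process every item; the exit condition is hidden inside the body."""
    processed = []
    while True:
        if not queue:          # the reader has to dig for this
            break
        processed.append(queue.pop(0))
    return processed


def drain_with_condition(queue):
    """Process every item; the loop header says when we stop."""
    processed = []
    while queue:               # "keep going while there is work left"
        processed.append(queue.pop(0))
    return processed
```

The behaviour is identical; the difference is only in how quickly a maintainer can see why the loop ends.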

This is not to say that there is no place for infinite loops. In some circumstances, such as threads, an infinite loop may be unavoidable. There are always good reasons to write code a certain way. I am simply saying that taking an extra minute to write more maintainable code can make someone’s life a little easier. An infinite loop is a lazy loop.

Written by Collin Price

March 1, 2010 at 2:34 pm

Posted in best-practices

The Internet

No single networking technology is best suited for all needs. Since each type of network uses its own signal and data encoding, packet format, and addressing, it becomes difficult to send data between these networks. A technology is required to bridge the gap and create what appears to be a homogeneous network when in reality the network is completely heterogeneous. This technology is known as the internetwork or, simply put, the internet.

The internet is made up of special-purpose network hardware called internet routers or internet gateways. These special-purpose computer systems are designed to work with all types of LAN and WAN technologies. Internet routers act as the interconnects between different networks. The internet can be conceptualized as a cloud:

The Internet in the shape of a cloud.

The result is essentially a virtual network. To communicate across it we need a common addressing and naming scheme, which is provided by the TCP/IP protocols.

Written by Collin Price

February 21, 2010 at 2:51 pm

Posted in computer networks

Protocols and Layers

Sending data across a network is easy. The hard part is organizing it. There need to be transmission protocols in place to make sure each side knows how to send and receive data. Protocols dictate the format and meaning of messages, the rules for exchanging them, and the rules for handling problems. Without protocols, problems like corrupted bits, packet loss, or packet duplication cannot be detected or handled.

Sets of protocols work together to solve parts of the communication problem. These sets are referred to as protocol suites. A suite is divided up into layers, and each layer solves a sub-problem. An early example of a layered design is the ISO 7-layer reference model. Below is each layer and an example protocol that operates on it:

  1. Physical – RS-232
  2. Data Link – Ethernet
  3. Network – IP
  4. Transport – TCP
  5. Session – NetBIOS
  6. Presentation – MIME
  7. Application – HTTP

Protocol software follows the same layered model. Each layer has one software module, and incoming and outgoing data passes through each module in turn. At each software layer the packet is modified: for outgoing packets the layer’s header is prepended to the front of the packet, and for incoming packets the layer’s header is removed and processed.
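A rough sketch of this header handling, in Python. The layer names and the bracketed header strings are invented for illustration; real headers are binary structures, not text tags.

```python
# Hypothetical layers, listed from the top of the stack downwards.
LAYERS = ["transport", "network", "link"]


def send(payload: bytes) -> bytes:
    """Pass outgoing data down the stack; each layer puts its header in front."""
    packet = payload
    for layer in LAYERS:
        header = f"[{layer}]".encode()
        packet = header + packet
    return packet


def receive(packet: bytes) -> bytes:
    """Pass incoming data up the stack; each layer strips its own header."""
    for layer in reversed(LAYERS):
        header = f"[{layer}]".encode()
        if not packet.startswith(header):
            raise ValueError(f"missing {layer} header")
        packet = packet[len(header):]
    return packet
```

Calling `send(b"hello")` gives `b"[link][network][transport]hello"`: the lowest layer's header ends up outermost, so it is the first one removed on the receiving side.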

Layer Headers

Layer N at the destination is expected to receive exactly the data that was sent by layer N at the source. Common techniques like parity bits and CRCs are used to detect bit corruption. Sequence numbers are used to handle out-of-order delivery and packet duplication.
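Here is a small Python sketch of both ideas together. The frame layout (a 4-byte sequence number, the data, then a 4-byte CRC-32) is an assumption made for this example and does not correspond to any particular protocol.

```python
import struct
import zlib


def make_frame(seq: int, data: bytes) -> bytes:
    """Build a frame: sequence number + data, followed by a CRC-32 of both."""
    body = struct.pack("!I", seq) + data
    return body + struct.pack("!I", zlib.crc32(body))


def parse_frame(frame: bytes):
    """Return (seq, data), or None if the CRC shows the frame was corrupted."""
    body = frame[:-4]
    (crc,) = struct.unpack("!I", frame[-4:])
    if zlib.crc32(body) != crc:
        return None
    (seq,) = struct.unpack("!I", body[:4])
    return seq, body[4:]


def reassemble(frames):
    """Drop corrupted frames and duplicates, then put the data back in order."""
    received = {}
    for frame in frames:
        parsed = parse_frame(frame)
        if parsed is None:
            continue                      # corrupted: discard
        seq, data = parsed
        received.setdefault(seq, data)    # duplicate: keep the first copy
    return b"".join(received[seq] for seq in sorted(received))
```

A real protocol would also ask for lost or corrupted frames to be retransmitted; this sketch only shows detection, de-duplication, and reordering.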

In a perfect world all computers would be upgraded at the same time and we would not have to worry about compatibility issues. A problem when networks have different hardware is that some computers will send packets faster than others. A receiver could become overwhelmed with traffic. Flow control must be put in place to regulate the amount of traffic to a computer. Flow control is the managing of the rate of data transmission between two nodes. The receiver controls how fast it receives data. Two forms of flow control exist: stop-and-go and sliding window.

Stop-and-Go:

The stop-and-go method is the simplest. The sending side transmits one packet and waits for an acknowledgment signal from the receiver. The receiving side receives and consumes a packet and then transmits the acknowledgment signal. This method is slow and inefficient. It should only be used in special cases.
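As a toy model in Python (no real network involved, and the function name is made up), the cost of this method is easy to see: every packet costs one full round trip.

```python
def stop_and_go(packets):
    """Send packets one at a time, waiting for an acknowledgment after each."""
    delivered = []
    round_trips = 0
    for packet in packets:
        delivered.append(packet)   # receiver receives and consumes the packet
        round_trips += 1           # sender idles until the acknowledgment arrives
    return delivered, round_trips
```

Sending three packets this way takes three round trips, during which the link sits idle most of the time.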

Sliding Window:

The receiving side sets up multiple buffers to store incoming packets. It sends a signal to the sender telling it how much room it has. The sender transmits as many packets as the receiver can hold. As the receiver consumes packets it sends a signal to the sender that it can send another packet. This method allows several packets to be in transit at the same time. It is fast and efficient.
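A deliberately simplified Python model of the idea follows. It assumes the receiver acknowledges a whole window at once, which real implementations do not have to do (they can slide the window one packet at a time); the function name and the "rounds" counter are inventions for the example.

```python
from collections import deque


def sliding_window(packets, window_size):
    """Deliver packets with at most window_size of them in flight at once."""
    if window_size < 1:
        raise ValueError("window_size must be at least 1")
    pending = deque(packets)
    in_flight = deque()
    delivered = []
    rounds = 0
    while pending or in_flight:
        # Sender fills the window that the receiver advertised.
        while pending and len(in_flight) < window_size:
            in_flight.append(pending.popleft())
        # Receiver consumes what arrived and acknowledges it, freeing its buffers.
        while in_flight:
            delivered.append(in_flight.popleft())
        rounds += 1
    return delivered, rounds
```

In this model five packets need five rounds with a window of one, but only three rounds with a window of two.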

Now that our data is organized and our machines have flow control, the fundamental network problem of congestion arises. Network congestion is analogous to a highway traffic jam. A computer or node has filled its buffers and packets begin to be discarded. The nodes have no other choice in the matter. The sending computer will likely retransmit the packet after a certain time, creating more traffic and prolonging the problem. There are two solutions to this problem. The buffer space could be increased to deal with the traffic, but we all know that if the space increases so will the traffic, so this is only a temporary fix. Alternatively, the sender should keep track of its packet loss, use this as an indicator that there is a problem, and reduce or stop sending.

Written by Collin Price

February 20, 2010 at 10:19 pm

Asynchronous Transfer Mode

Asynchronous Transfer Mode (ATM) is a standard digital data transmission protocol. It was originally developed in the mid-1980s by the phone companies as a replacement for the Internet. ATM is meant to handle voice, video and data. It has a connection-oriented interface: a connection needs to be established before data can be sent, and closed when no more data needs to be sent. An ATM packet, called a cell, has a fixed size of 53 octets (a 5-octet header and a 48-octet payload), chosen as a compromise between the needs of voice and data.

The key to ATM’s speed is its switching system. It uses cell forwarding to send incoming packets to their outgoing interface via hardware. Each ATM connection is identified by a 24-bit binary value. This is known as Virtual Path Identifier/Virtual Channel Identifier (VPI/VCI) but it’s generally called a label. The VPI/VCI is rewritten at each switch.
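The label lookup and rewrite can be sketched in Python as follows. The 8/16 split of the 24 bits between VPI and VCI, the port numbers, and the table contents are assumptions made for the example; a real switch does this in hardware.

```python
VPI_BITS = 8    # assumed width of the Virtual Path Identifier
VCI_BITS = 16   # assumed width of the Virtual Channel Identifier


def pack_label(vpi: int, vci: int) -> int:
    """Combine VPI and VCI into one 24-bit label."""
    return (vpi << VCI_BITS) | vci


def unpack_label(label: int):
    """Split a 24-bit label back into (vpi, vci)."""
    return label >> VCI_BITS, label & ((1 << VCI_BITS) - 1)


def forward_cell(table, in_port: int, label: int):
    """Look up where a cell goes and which label it carries on the next link.

    table maps (incoming port, incoming label) -> (outgoing port, outgoing label).
    """
    return table[(in_port, label)]


# Hypothetical table entry set up when the connection was established.
switch_table = {(0, pack_label(1, 42)): (3, pack_label(7, 9))}
out_port, new_label = forward_cell(switch_table, 0, pack_label(1, 42))
```

Because the label only has to be unique per link, each switch is free to pick its own value for the next hop, which is why the label changes along the path.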

All ATM connections define a certain quality of service. When a connection is established the endpoint specifies the type of data transfer, the throughput desired, the maximum packet burst size, and the maximum delay tolerated. This makes ATM well suited to traffic that requires low latency and a very high quality of service, such as audio and video streams.

Types of Data Transfer:

  • Constant Bit Rate (CBR)
    • audio
  • Variable Bit Rate (VBR)
    • video with adaptive encoding
  • Available Bit Rate (ABR)
    • data
  • Unspecified Bit Rate (UBR)

Asynchronous Transfer Mode failed to deliver as promised. The switches were too expensive for LANs and the quality of service ended up being impossible to implement. With the increase in live video and audio streams, interest in ATM has grown.

Written by Collin Price

February 19, 2010 at 10:49 am

Packet Routing

Routing tables are essential for a good network. They decide where to send each packet in an effective way. Routing tables can be set either manually or automatically. Routing tables that are set manually are usually used in small networks where the number of computers on the LAN does not change. In larger networks, such as a business or university, manual routing tables would be impossible to maintain, so automatic routing is used. The tables are created and updated by software. They are rebuilt when a failure to send occurs, for example because a computer is no longer on the network, or when a new computer is added.

Graph theory can be applied to computer networks. A switch represents a node and a connection represents an edge. The cost of an edge can be represented with time in milliseconds, traffic load, or even physical distance but for most cases the cost of an edge is one.

In a switched network there is no central authority to tell each switch how to reach all of the other switches. Therefore each switch must learn the route to each destination, which means calculating the shortest path to every other switch. There are two algorithms for computing shortest paths: distance vector (DV) and link-state. Both are used in practice.

Distance Vector:

Switches exchange the information in their routing tables with their neighbours. Each switch sends a list of pairs, where each pair gives (destination, distance). The receiving switch compares its own list of routes with the new list. If it finds a better route to a destination, it updates its table to use that route.
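The update step might look something like this in Python. The table layout (destination mapped to a distance and next hop) and the function name are assumptions for the sketch, and it only covers the "found a better route" case described above, not route withdrawal or timeouts.

```python
def dv_update(table, neighbour, link_cost, advertised):
    """Merge a neighbour's (destination, distance) list into our routing table.

    table maps destination -> (distance, next_hop).
    Returns True if any route changed, i.e. we should tell our own neighbours.
    """
    changed = False
    for destination, distance in advertised:
        candidate = distance + link_cost       # cost of going via this neighbour
        if destination not in table or candidate < table[destination][0]:
            table[destination] = (candidate, neighbour)
            changed = True
    return changed


# Switch A hears from neighbour B over a link of cost 1.
table_a = {"A": (0, "A")}
dv_update(table_a, "B", 1, [("B", 0), ("C", 2), ("A", 1)])
```

After the call, A knows it can reach B at distance 1 and C at distance 3, both via B, while its own entry is left alone because the advertised route back to A is worse.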

The distance vector algorithm can run into a bad information propagation problem. Sometimes if a link is lost a routing loop can occur:

In the simplest version, a routing loop of size two, node A thinks that the path to some destination (call it C) is through its neighbouring node, node B. At the same time, node B thinks that the path to C starts at node A. Thus, whenever traffic for C arrives at either A or B, it will loop endlessly between A and B, unless some mechanism exists to prevent that behaviour.

Link-state overcomes the instabilities in the distance vector algorithm.

Link-State:

In link-state each switch periodically sends out a message to its neighbours telling them which switches it is connected to. If any of these links change, the switch computes new routes. Dijkstra’s algorithm is used to calculate the new routes.
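For reference, a compact Python version of Dijkstra's algorithm using the standard `heapq` module. The graph format (a dictionary of neighbour-to-cost dictionaries) and the example topology are made up for illustration.

```python
import heapq


def dijkstra(graph, source):
    """Return the cost of the shortest path from source to every reachable node.

    graph maps node -> {neighbour: edge cost}; costs must be non-negative.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue                          # stale entry, a shorter path was found
        for neighbour, cost in graph.get(node, {}).items():
            new_dist = d + cost
            if new_dist < dist.get(neighbour, float("inf")):
                dist[neighbour] = new_dist
                heapq.heappush(heap, (new_dist, neighbour))
    return dist


# Hypothetical four-switch network with symmetric link costs.
network = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}
```

Running `dijkstra(network, "A")` finds that C is cheaper to reach through B (cost 3) than directly (cost 4), and D is reached through C at cost 4.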

Written by Collin Price

February 19, 2010 at 10:43 am

Packet Switching

From Wikipedia,

A packet-switched network is a digital communications network that groups all transmitted data, irrespective of content, type, or structure into suitably-sized blocks, called packets.

Efficient long-distance communication networks, such as the Internet, are connected through a series of packet switches. Each of these connection points can be considered a “node”.

Interconnect switch network.

Each packet switch performs the same routine after receiving a packet. The process is known as “Store and Forward”. First the packet is stored in memory, then the switch examines the destination address, and finally the packet is forwarded toward its destination. A packet’s destination address consists of two parts: the packet switch number and the computer on that switch. The address is encoded as an integer, with the high-order bits used for the switch number and the low-order bits used for the computer number. For example, address (1,2) means that the packet is destined for computer 2 on switch 1.
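The two-part address can be packed into a single integer like this. The choice of 8 low-order bits for the computer number is purely an assumption for the example; the post does not say how wide each field is.

```python
COMPUTER_BITS = 8   # assumed number of low-order bits for the computer number


def encode_address(switch: int, computer: int) -> int:
    """Pack (switch, computer) into one integer, switch in the high-order bits."""
    if not 0 <= computer < (1 << COMPUTER_BITS):
        raise ValueError("computer number does not fit in the low-order bits")
    return (switch << COMPUTER_BITS) | computer


def decode_address(address: int):
    """Split an encoded address back into (switch, computer)."""
    return address >> COMPUTER_BITS, address & ((1 << COMPUTER_BITS) - 1)
```

With these widths, the address (1,2) is stored as the integer 258, and a switch only needs to shift the value right to find out which switch the packet is headed for.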

A packet switch has a routing table to decide where to forward each packet. A routing table keeps track of all the paths of the immediate network around it.

Destination      Next Hop
(1, anything)    Interface 1
(2, anything)    local computer

The switch with this table knows that a packet addressed to (1,4) should be forwarded to Interface 1, and that a packet addressed to (2,3) is destined for the local network and will be delivered to computer 3.
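A lookup against the table above could be sketched in Python like this. The dictionary and the returned strings are illustrative only; the point is that the switch looks at the switch number alone and ignores the computer number unless the packet is local.

```python
# Routing table for a hypothetical switch number 2, keyed by destination switch.
ROUTING_TABLE = {
    1: "Interface 1",
    2: "local computer",
}


def next_hop(destination):
    """Decide where to send a packet addressed to (switch, computer)."""
    switch, computer = destination
    hop = ROUTING_TABLE[switch]
    if hop == "local computer":
        return f"local computer {computer}"   # deliver directly
    return hop                                # pass along to the next switch
```

Here `next_hop((1, 4))` returns `"Interface 1"` and `next_hop((2, 3))` returns `"local computer 3"`.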

Written by Collin Price

February 18, 2010 at 5:59 pm

Computer Networks – Extending Your Network

As your local area network becomes larger, certain hardware limitations can appear. The signal quality begins to degrade as the cable lengths become longer. There needs to be a way to connect two or more network segments that are spread across a university campus or business.

A repeater can be used to amplify a signal. A repeater connects two LAN segments. It receives the signal from one end and transmits an amplified version of the signal out the other end. Unfortunately, noise and collisions are propagated across the device. The device itself is dumb: it does not recognize frames, it only reads signals. Repeaters are limited to five segments because beyond that the signal becomes too degraded to read properly.

Repeaters are effective for two LAN segments. If you have multiple segments that need to be connected, a hub should be used. A hub connects multiple Ethernet devices together to create a single network segment. Each device’s signal is propagated to all connections. Like a repeater, a hub does not understand frames.

Hubs and repeaters are low-cost devices used for creating simple networks. Eventually the network will slow down and become cluttered with every single computer’s traffic. A bridge is similar to a repeater in that it connects two LAN segments, although it is more efficient. A bridge understands frames: it knows where a frame came from and where it needs to go. When a bridge is added to a network it begins to build a list of all computers on each side of the bridge by storing each frame’s source address. Once it has learned which side a computer is on, it will not forward a frame across the bridge if the destination is on the same side the frame came from. For example:

A bridge is connected to segment A and segment B. After reading some frames the bridge has learned that segment A contains computers 1, 2 and 3, and that segment B contains computers 4, 5 and 6. If the bridge encounters a frame that came from computer 1 and has a destination address of computer 3, the frame will not cross the bridge. If a frame with a source address of 1 and a destination address of 6 arrives, it will be let across.

As new source addresses are encountered the bridge adds them to the list. New computers can easily be connected to the network without any configuration to the bridge.
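The learning behaviour can be sketched in a few lines of Python. The class and method names are made up, and one detail is an assumption not stated above: a frame whose destination has not been seen yet is forwarded, since the bridge cannot know which side it is on.

```python
class LearningBridge:
    """Toy model of a bridge joining two segments."""

    def __init__(self):
        self.side_of = {}   # computer address -> segment it was last seen on

    def handle_frame(self, source, destination, arrived_on):
        """Learn the source's segment; return True if the frame should cross."""
        self.side_of[source] = arrived_on
        known_side = self.side_of.get(destination)
        # Keep the frame local only when the destination is known to be
        # on the same segment the frame arrived from.
        return known_side != arrived_on


bridge = LearningBridge()
first = bridge.handle_frame(1, 3, "A")    # 3 not seen yet, so forward
second = bridge.handle_frame(3, 1, "A")   # 1 is known to be on A, stay local
third = bridge.handle_frame(6, 1, "B")    # 1 is on the other side, forward
```

No configuration was needed: the table filled itself in from the source addresses of the frames that passed by.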

Using a hub is an easy way to have all the computers in a network talk to each other, and a bridge allows for simple network extensions. A network switch is the best of both worlds. It looks like a hub but performs like a bridge. It connects multiple computers together and understands their network traffic. One key feature of a network switch is that it allows for simultaneous data transfers in some situations. If a switch is connected to four computers A, B, C and D, then A and B can transfer data between each other at the same time as C and D are transferring data. A switch creates micro-segments to allow for full bandwidth between connections. Most modern LANs run entirely on network switches. Linnaeus University, the university I am currently attending, runs on a completely switched network.

Written by Collin Price

February 16, 2010 at 10:42 pm
