Computer Networking: Application layer

Principles of network applications
Network application architectures
Application architecture: (different from the network architecture) Designed by the application developer; dictates how the application is structured over the various end systems.
Two predominant architectural paradigms: a) the client-server architecture b) the peer-to-peer (P2P) architecture.
 
Client-server architecture: There is an always-on host (server), which services requests from many other sometimes-on or always-on hosts (clients).
Clients do not directly communicate with each other;
The server has a fixed, well-known address (IP address).
e.g. the Web, FTP, Telnet, email.
Server farm: a cluster of hosts used to create a powerful virtual server in client-server architectures.
Infrastructure intensive: application services based on the client-server architecture often require service providers to purchase, install, and maintain server farms.
 
P2P architecture: The application exploits direct communication between pairs of peers.
Peers communicate without passing through a dedicated server.
Peers: intermittently connected hosts that are not owned by the service provider but are devices controlled by users.
e.g. BitTorrent, eMule, Skype, PPLive.
P2P architectures are self-scalable: each peer generates workload by requesting service, but also adds service capacity to the system.
 
Some applications have hybrid architectures, combining both client-server and P2P elements.
e.g. Instant messaging applications.
 
Processes Communicating
Process: a program that is running within an end system.
When processes are running on the same end system, they communicate with each other with interprocess communication;
When processes are running on different end systems, they communicate with each other by exchanging messages across the computer network.
 
How does a process indicate which process it wants to communicate with?
To identify the receiving process, 2 pieces of information need to be specified:
a. The name or address of the host. [IP address]
b. An identifier that specifies the receiving process in the destination host. [A destination port number]
 
 
Client and Server processes
A network application consists of pairs of processes that send messages to each other over a network. We label one of the two processes as the client and the other as the server.
(Even in applications such as P2P file sharing, where a process can be both a client and a server, we can still label one process as the client and the other as the server.)
In the context of a communication session between a pair of processes:
The process that initiates the communication is labeled as the client;
The process that waits to be contacted to begin the session is the server.
 
Socket
Socket: The interface between the application layer and the transport layer within a host, also referred to as the Application Programming Interface (API) between the application and the network.
A process sends messages into, and receives messages from, the network through a socket.
Analogy: a door
The application developer has control of everything on the application-layer side of the socket;
The application developer has little control of the transport-layer side of the socket. That control is limited to:
a) The choice of transport protocol; b) The ability to fix a few transport-layer parameters.
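The ideas above (a receiving process identified by IP address plus port number, a client that initiates and a server that waits, and a socket as the door to the transport layer) can be sketched with Python's standard `socket` module. This is only an illustration on the loopback interface: the function names `run_echo_server` and `demo`, the address 127.0.0.1 and the "echo" behaviour are assumptions, not part of the notes.

```python
import socket
import threading


def run_echo_server(server_sock: socket.socket) -> None:
    """Server process: waits to be contacted, then echoes one message back."""
    conn, _client_addr = server_sock.accept()  # blocks until a client connects
    with conn:
        data = conn.recv(1024)
        conn.sendall(b"echo: " + data)
    # Leaving the "with" block closes the connection, which signals end-of-data.


def demo() -> bytes:
    # Choosing SOCK_STREAM is the "choice of transport protocol": TCP.
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.settimeout(5.0)            # avoid waiting forever in this demo
    server_sock.bind(("127.0.0.1", 0))     # port 0: let the OS pick a free port
    server_sock.listen(1)
    host, port = server_sock.getsockname()  # (IP address, port) identifies the server process

    worker = threading.Thread(target=run_echo_server, args=(server_sock,))
    worker.start()

    # Client process: initiates the session by naming the server's host and port.
    chunks = []
    with socket.create_connection((host, port), timeout=5.0) as client_sock:
        client_sock.sendall(b"hello")
        while True:
            chunk = client_sock.recv(1024)
            if not chunk:                  # empty read: the server closed the connection
                break
            chunks.append(chunk)

    worker.join()
    server_sock.close()
    return b"".join(chunks)
```

Everything on the application side of the socket (what is sent, when, and how the reply is interpreted) is decided by this code; how the bytes actually cross the network is left to TCP.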
 
 
Transport-layer protocols for applications
Transport services classification
The services that a transport-layer protocol can offer to applications invoking it can be broadly classified along 4 dimensions: reliable data transfer, throughput, timing, and security.
 
Reliable data transfer
If a protocol provides a guaranteed data delivery service, i.e. it guarantees that the data sent by one end of the application is delivered correctly and completely to the other end, it is said to provide reliable data transfer.
A transport-layer protocol that doesn’t provide reliable data transfer may be acceptable for loss-tolerant applications. (e.g. multimedia applications like real-time audio/video)
 
Throughput
A transport-layer protocol can provide guaranteed available throughput at some specified rate. With such a service, the application could request a guaranteed throughput of r bits/sec, and the transport protocol would then ensure that the available throughput is always at least r bits/sec.
Bandwidth-sensitive applications: Applications that have throughput requirements.
Elastic applications: Applications that can make use of as much, or as little, throughput as happens to be available.
 
Timing
Timing guarantees can come in many shapes and forms.
e.g. a guarantee that every bit the sender pumps into the socket arrives at the receiver’s socket no more than 100 msec later.
(Appealing to interactive real-time applications.)
 
Security
A transport protocol can provide an application with one or more security services, providing confidentiality, data integrity or end-point authentication between the 2 processes.
 
 
Transport services provided by the Internet
The Internet makes 2 transport protocols available to applications, UDP and TCP.


Services provided by TCP protocol
When an application invokes TCP as its transport protocol, the application receives both a connection-oriented service and a reliable data transfer service from TCP.
1) Connection-oriented service: TCP has the client and server exchange transport-layer control information with each other (i.e. a handshaking procedure) before the application-level messages begin to flow. After the handshaking phase, a TCP connection exists between the sockets of the two processes.
The connection is a full-duplex connection. (i.e. The two processes can send messages to each other over the connection at the same time.)
When the application finishes sending messages, it must tear down the connection.
2) Reliable data transfer service: The communicating processes can rely on TCP to deliver all data sent without error and in the proper order.
TCP also includes a congestion-control mechanism that throttles a sending process when the network is congested between sender and receiver, and attempts to limit each TCP connection to its fair share of network bandwidth.
 
Services provided by UDP protocol
UDP is a no-frills, lightweight transport protocol, providing minimal services.
UDP is connectionless (no handshaking), provides an unreliable data transfer service, and does not include a congestion-control mechanism.
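To make the "no handshaking" point concrete, here is a minimal sketch using Python's standard `socket` module on the loopback interface. The function name `udp_demo` and the payload are illustrative assumptions. A datagram is simply addressed and sent; nothing in UDP itself confirms delivery (on the loopback interface loss is unlikely, which is why the demo works).

```python
import socket


def udp_demo() -> bytes:
    # SOCK_DGRAM selects UDP instead of TCP.
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    receiver.settimeout(2.0)          # UDP gives no delivery guarantee, so do not wait forever
    receiver_addr = receiver.getsockname()

    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # No connection setup: each datagram carries its destination address.
    sender.sendto(b"hello", receiver_addr)

    data, _source_addr = receiver.recvfrom(1024)
    sender.close()
    receiver.close()
    return data
```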
 
Services not provided by Internet transport protocols
Today’s Internet can often provide satisfactory service to time-sensitive applications, but it cannot provide any timing or bandwidth (throughput) guarantees.
 
 
Application-layer protocols
An application-layer protocol defines:
1) The types of messages exchanged;
2) The syntax of the various message types;
3) The semantics of the fields in the message;
4) Rules for determining when and how a process sends messages and responds to messages.
Some application-layer protocols are available in the public domain (specified in RFCs), e.g. HTTP;
Many other application-layer protocols are proprietary and intentionally not available in the public domain.
An application-layer protocol (e.g. HTTP, SMTP) is only one piece of the network application that uses it (e.g. the Web, e-mail).
 
1. The Web & HTTP
Network application: Web
Application-layer protocol: HTTP
HyperText Transfer Protocol (HTTP): The Web’s application-layer protocol, defines the structure of HTTP messages and how the client and server exchange the messages.
When a user requests a Web page, the browser sends HTTP request messages for the objects in the page to the server. The server receives the requests and responds with HTTP response messages that contain the objects.
HTTP is implemented in a client program and a server program executing on different end systems that talk to each other by exchanging HTTP messages.
Web browsers implement the client side of HTTP;
Web servers implement the server side of HTTP and house Web objects, each addressable by a URL.
Object: An object is simply a file that is addressable by a single URL. (e.g. an HTML file, a JPEG image, a Java applet, a video clip.)
Web page: A Web page consists of objects.
Most Web pages consist of a base HTML file and several referenced objects. The base HTML file references the other objects in the page with the objects’ URLs.
URL: consists of the hostname of the server that houses the object and the object’s path name.
HTTP uses TCP as its underlying transport protocol.
The HTTP client first initiates a TCP connection with the server. Once the connection is established, the browser and the server processes access TCP through their socket interfaces.
The client sends HTTP request messages into its socket interface and receives HTTP response messages from its socket interface;
The server receives HTTP request messages from its socket interfaces and sends HTTP response messages into its socket interface.
HTTP is a stateless protocol: an HTTP server maintains no information about the clients.
 
HTTP can use both non-persistent connections and persistent connections (default).
Non-persistent connections: If each request/response pair is sent over a separate TCP connection, the application uses non-persistent connections. Each TCP connection transports exactly one request message and one response message, and is closed after the server sends the object.
Persistent connections: If all of the requests and their corresponding responses are sent over the same TCP connection, the application uses persistent connections. The server leaves the TCP connection open after sending a response, so subsequent requests and responses between the same client and server can be sent over that connection.
Round-trip time (RTT): The time for a small packet to travel from client to server and then back to the client.

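Time needed to request and receive an object = 2*RTT + the transmission time at the server (non-persistent): one RTT to establish the TCP connection, and one RTT to send the request and receive the start of the response.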
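The response-time expression can be turned into a small calculation that compares the two connection styles. This is a back-of-the-envelope sketch under stated assumptions that are not in the notes: objects are fetched one after another (no parallel connections, no pipelining), every object has the same transmission time, and the function names are made up for illustration.

```python
def nonpersistent_time(rtt: float, transmit: float, num_objects: int = 1) -> float:
    """Every object pays 2 RTTs (TCP setup + request/response) plus its transmission time."""
    return num_objects * (2 * rtt + transmit)


def persistent_time(rtt: float, transmit: float, num_objects: int = 1) -> float:
    """One RTT sets up the single connection; each object then costs 1 RTT plus transmission."""
    return rtt + num_objects * (rtt + transmit)
```

For example, with an RTT of 0.1 s, a transmission time of 0.01 s per object, and a page of 11 objects (a base HTML file plus 10 referenced objects), the non-persistent estimate is about 2.31 s and the persistent estimate about 1.31 s. For a single object both functions reduce to 2*RTT + transmission time.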
 
There are 2 types of HTTP messages, request messages and response messages.
HTTP request message:

 
e.g.
GET /somedir/page.html HTTP/1.1
Host: www.someschool.edu
Connection: close
User-agent: Mozilla/4.0
Accept-language: fr
Request line: The first line of an HTTP request message; consists of the method field, the URL field and the HTTP version field.
Header lines: The subsequent lines after the request line.
The method field can take on values including GET, POST, HEAD, PUT and DELETE.
The GET method is used when the browser requests an object with the requested object identified in the URL field;
The POST method is used when the user fills out a form; the entity body contains what the user entered into the form fields (the entity body is empty with GET);
(The GET method can also be used to generate a request with a form by including the inputted data in the requested URL.)
The HEAD method is often used by application developers for debugging. When a server receives a request with the HEAD method, it responds with an HTTP message but leaves out the requested object;
The PUT method allows a user or an application to upload an object on a Web server;
The DELETE method allows a user or an application to delete an object on a Web server.
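The request format described above (a request line with three fields, then header lines, then an optional entity body) is simple enough to split by hand. The following sketch parses the example request from these notes; the function name `parse_request` is an assumption, and the parser is deliberately minimal (it does not handle folded or repeated headers).

```python
def parse_request(raw: str):
    """Split an HTTP request into (method, url, version, headers, body)."""
    # A blank line (CRLF CRLF) separates the header section from the entity body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")

    # Request line: method field, URL field, HTTP version field.
    method, url, version = lines[0].split(" ", 2)

    # Header lines: "Name: value". Names are lower-cased for easy lookup.
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, url, version, headers, body


EXAMPLE_REQUEST = (
    "GET /somedir/page.html HTTP/1.1\r\n"
    "Host: www.someschool.edu\r\n"
    "Connection: close\r\n"
    "User-agent: Mozilla/4.0\r\n"
    "Accept-language: fr\r\n"
    "\r\n"
)
```

Calling `parse_request(EXAMPLE_REQUEST)` yields the method `GET`, the URL `/somedir/page.html`, the version `HTTP/1.1`, four header lines, and an empty entity body, as expected for a GET.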
 
HTTP response message

 
e.g.
HTTP/1.1 200 OK
Connection: close
Date: Sat, 07 Jul 2007 12:00:15 GMT
Server: Apache/1.3.0 (Unix)
Last-Modified: Sun, 6 May 2007 09:23:24 GMT
Content-Length: 6821
Content-Type: text/html
 
(data data data data data …)
The status line consists of the protocol version field, a status code and a corresponding status message.
Some common status codes and associated phrases include:
200 OK; 301 Moved Permanently; 400 Bad Request; 404 Not Found; 505 HTTP Version Not Supported.
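On the receiving side, a client reads the status line first and then uses the header lines to interpret the body. The sketch below shows one way to do that; `parse_response` is a hypothetical helper, and using Content-Length to decide how many body bytes belong to the object is a simplification of what real HTTP clients do.

```python
def parse_response(raw: bytes):
    """Split an HTTP response into (version, status_code, phrase, headers, body)."""
    head, _, rest = raw.partition(b"\r\n\r\n")
    lines = head.decode("ascii").split("\r\n")

    # Status line: protocol version, status code, status message (may contain spaces).
    version, code, phrase = lines[0].split(" ", 2)

    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    # Content-Length states how many bytes of entity body follow the blank line.
    length = int(headers.get("content-length", len(rest)))
    return version, int(code), phrase, headers, rest[:length]
```

A caller would typically branch on the numeric code, for example treating 200 as success and 404 as "object not found", rather than on the phrase.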
 
Cookies allow servers to keep track of users and help to create a user session layer on top of stateless HTTP.
Cookie technology has 4 components:
1) A cookie header line in the HTTP response message; (e.g. Set-cookie: 1678)
2) A cookie header line in the HTTP request message; (e.g. Cookie: 1678)
3) A cookie file kept on the user’s end system and managed by the user’s browser;
4) A back-end database at the Web site.

Each time the client requests a Web page from the server, the browser consults the cookie file, extracts the identification number for this site, and puts a cookie header line that includes the identification number in the HTTP request.
In this manner, the server is able to track the user’s activity at that site.
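The four cookie components can be modelled in a few lines. This is a toy in-memory simulation, not real HTTP: the class names `CookieServer` and `Browser`, the site name `shop.example`, and the use of plain dictionaries for the cookie file and the back-end database are all assumptions made for illustration. The starting identification number 1678 is taken from the example header lines above.

```python
import itertools


class CookieServer:
    def __init__(self):
        self._ids = itertools.count(1678)   # next identification number to hand out
        self.database = {}                   # back-end database: id -> pages requested

    def handle(self, path, request_headers):
        response_headers = {}
        cookie_id = request_headers.get("Cookie")
        if cookie_id not in self.database:
            # Unknown user: create an entry and send a Set-cookie header line.
            cookie_id = str(next(self._ids))
            self.database[cookie_id] = []
            response_headers["Set-cookie"] = cookie_id
        self.database[cookie_id].append(path)  # track this user's activity
        return response_headers


class Browser:
    def __init__(self):
        self.cookie_file = {}                # cookie file: site -> identification number

    def request(self, site, server, path):
        headers = {}
        if site in self.cookie_file:
            headers["Cookie"] = self.cookie_file[site]   # Cookie header line in the request
        response_headers = server.handle(path, headers)
        if "Set-cookie" in response_headers:
            self.cookie_file[site] = response_headers["Set-cookie"]
```

After two requests from the same browser, the server's database holds both paths under a single identification number, even though each HTTP exchange on its own is stateless.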
 
Web cache / proxy server: A network entity that satisfies HTTP requests on behalf of an origin Web server.
A Web cache has its own disk storage and keeps copies of recently requested objects in this storage. Once a browser is configured to use the cache, each browser request for an object is first directed to the Web cache.

Process:
1) The browser establishes a TCP connection to the Web cache and sends an HTTP request for the object to the Web cache;
2) The Web cache checks if it has a copy of the object stored locally. If it does, the Web cache returns the object within an HTTP response message to the client browser;
3) If the Web cache does not have the object, the Web cache opens a TCP connection to the origin server, sends an HTTP request for the object into the cache-to-server TCP connection.
After receiving this request, the origin server sends the object within an HTTP response to the Web cache;
4) When the Web cache receives the object, it stores a copy in its local storage, and sends a copy within an HTTP response message to the client browser (over the existing TCP connection between the client browser and the Web cache).
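The four steps above amount to a lookup-then-fetch pattern, sketched below. The class name `WebCache`, the `fetch_from_origin` callable standing in for the cache-to-server HTTP exchange, and the `origin_requests` counter are assumptions for illustration; a real cache would also bound its storage and check freshness.

```python
class WebCache:
    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin   # callable: url -> object bytes (stands in for the origin server)
        self._store = {}                  # local storage: url -> cached copy
        self.origin_requests = 0          # how often the origin server had to be contacted

    def get(self, url):
        # Step 2: if a copy is stored locally, return it right away.
        if url in self._store:
            return self._store[url]
        # Step 3: otherwise request the object from the origin server.
        self.origin_requests += 1
        obj = self._fetch(url)
        # Step 4: keep a copy locally and return the object to the client.
        self._store[url] = obj
        return obj
```

Only the first request for a given URL reaches the origin server; repeat requests are answered from local storage, which is where the savings in response time and access-link traffic come from.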
Typically a Web cache is purchased and installed by an ISP. Benefits:
a. A Web cache can substantially reduce the response time for a client request, particularly if the bottleneck bandwidth between the client and the origin server is much less than the bottleneck bandwidth between the client and the cache;
b. Web caches substantially reduce traffic on an institution’s access link to the Internet. (Thus the institution does not have to upgrade bandwidth as quickly, thereby reducing costs.)

 
 
For Web caching, because the object housed in the Web server may have been modified since the copy was cached, the copy residing in the cache may be stale.
The conditional GET: A mechanism that allows a cache to verify that its objects are up to date.
Conditional GET message: An HTTP request message that 1) uses the GET method and 2) includes an If-Modified-Since: header line, telling the server to send the object only if the object has been modified since the specified date.
Process:
1. On behalf of a requesting browser, a proxy cache sends a request message to a Web server:
Cache [requests]→ Web server
GET /fruit/kiwi.gif HTTP/1.1
Host: www.exotiquecuisine.com
2. The Web server sends a response message with the requested object to the cache:
Web server [response]→ Cache
HTTP/1.1 200 OK
Date: Sat, 7 Jul 2007 15:39:29
Server: Apache/1.3.0 (Unix)
Last-Modified: Wed, 4 Jul 2007 09:23:24
Content-Type: image/gif
 
(data data data data data …)
3. The cache forwards the object to the requesting browser. It also caches the object locally and stores the last-modified date along with the object.
4. (Later, another browser requests the same object via the cache.) The cache performs an up-to-date check by issuing a conditional GET:
Cache [requests with conditional GET]→ Web server
GET /fruit/kiwi.gif HTTP/1.1
Host: www.exotiquecuisine.com
If-modified-since: Wed, 4 Jul 2007 09:23:24
5. The server sends a response message to the cache. If not modified since that date, the response message does not include the requested object:
Web server [response (not include requested object)]→ cache
HTTP/1.1 304 Not Modified
Date: Sat, 14 Jul 2007 15:39:29
Server: Apache/1.3.0 (Unix)
6. If the response message that the cache receives has 304 Not Modified in the status line, the cache can go ahead and forward its cached copy of the object to the requesting browser.
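The server-side decision in step 5 can be sketched as a comparison of two timestamps. The helper name `respond` is hypothetical, and the date strings in the usage note carry an explicit "GMT" suffix (an assumption added here) so that the standard library's `email.utils.parsedate_to_datetime` returns directly comparable values.

```python
from email.utils import parsedate_to_datetime
from typing import Optional, Tuple


def respond(last_modified: str, if_modified_since: Optional[str], body: bytes) -> Tuple[int, bytes]:
    """Return (status_code, entity_body) for a GET that may carry If-Modified-Since."""
    if if_modified_since is not None:
        modified = parsedate_to_datetime(last_modified)
        cached = parsedate_to_datetime(if_modified_since)
        if modified <= cached:
            # Not modified since the cached copy was stored: send no object.
            return 304, b""
    # Ordinary GET, or the object changed after the given date: send it in full.
    return 200, body
```

With `last_modified` equal to "Wed, 04 Jul 2007 09:23:24 GMT", passing the same string as `if_modified_since` gives a 304 with an empty body, while omitting the header gives a 200 with the object.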
 
 
2. File transfer & FTP
Network application: File transfer
Application-layer protocol: FTP

The differences between HTTP and FTP:
1) FTP sends its control information out-of-band, while HTTP sends its control information in-band.
FTP uses 2 parallel TCP connections: a control connection for sending control information (e.g. user identification, password, commands to change remote directory, commands to ‘put’ and ‘get’ files) between the two hosts, and a data connection for sending a file;
HTTP sends request and response header lines into the same TCP connection that carries the transferred file itself.
2) Throughout a session, the FTP server must maintain state about the user, while HTTP is stateless.
The FTP server must associate the control connection with a specific user account, and keep track of the user’s current directory;
HTTP does not have to keep track of any user state.
Process:
1. By providing the hostname of the server (a remote host), the client (a user) causes the FTP client process in the local host to establish an FTP session with the server, initiating a control TCP connection with the server on server port number 21;
2. The client sends the user identification, password over the control connection; (access the remote account)
3. The client wanders about the remote directory tree, sending commands to change remote directory over the control connection;
4. The client sends a command for a file transfer (either to or from the server) over the control connection;
5. The server receives the command and initiates a TCP data connection to the client;
(If, during the same session, the client wants to transfer another file, FTP opens another data connection.)
The data connections of FTP are non-persistent. The FTP control connection remains open throughout the duration of the user session, but a new data connection is created for each file transferred within a session.
 
The FTP commands and replies are sent across the control connection in 7-bit ASCII format. Each command is followed by a reply.
FTP command: From client to server, consists of 4 uppercase ASCII characters, some with optional arguments. A carriage return and line feed end each command.
e.g.
·USER username: to send the user identification to the server;
·PASS password: to send the user password to the server;
·LIST: to ask the server to send back a list of all the files in the current remote directory. (The server will send the list over a new data connection.)
·RETR filename: to retrieve a file from the current directory of the server. (The server will send the requested file over a new data connection.)
·STOR filename: to store a file into the current directory of the server.
FTP reply: From server to client, three-digit numbers with an optional message following the number.
e.g.
·331 Username OK, password required
·125 Data connection already open; transfer starting
·425 Can’t open data connection
·452 Error writing file
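The command and reply formats described above are plain text lines, so they are easy to produce and read. The helpers below are a sketch with made-up names (`format_command`, `parse_reply`) and a made-up file name in the usage note; multi-line replies, which real FTP servers may also send, are not handled.

```python
def format_command(verb: str, argument: str = "") -> bytes:
    """Build one FTP command line: uppercase verb, optional argument, CRLF terminator."""
    line = verb.upper()
    if argument:
        line += " " + argument
    # Commands travel over the control connection as 7-bit ASCII.
    return (line + "\r\n").encode("ascii")


def parse_reply(line: str):
    """Split a single-line FTP reply into (three-digit code, optional message)."""
    code, _, message = line.strip().partition(" ")
    return int(code), message
```

For instance, `format_command("retr", "notes.txt")` produces the bytes for `RETR notes.txt` followed by a carriage return and line feed, and `parse_reply("331 Username OK, password required")` returns the code 331 together with its message.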
 
 
3. Electronic mail
Network application: Electronic mail
Application-layer protocol: SMTP (the principal protocol).


Each recipient has a mailbox located in a mail server, which manages and maintains the messages that have been sent to the recipient.

Process:
1. The sender’s user agent pushes the message to his/her mail server, which places the message in its outgoing message queue; [The sender’s user agent → the sender’s mail server, SMTP]
2. The sender’s mail server relays the message to the recipient’s server, where it is deposited in the recipient’s mailbox. [The sender’s mail server → the recipient’s mail server, SMTP]
If the sender’s server cannot deliver mail to the recipient’s server, the sender’s server holds the message in a message queue and attempts to transfer the message later. If there is no success after several days, the server removes the message and notifies the sender with an e-mail message.
3. When the recipient wants to access the message in his/her mailbox, the mail server containing his/her mailbox authenticates the recipient. The recipient’s user agent retrieves the message from his/her mailbox in his/her mail server. [The recipient’s mail server → The recipient’s user agent, mail access protocols including POP3, IMAP, HTTP]
 
Simple Mail Transfer Protocol (SMTP)
SMTP has a client side that executes on the sender’s mail server, and a server side that executes on the recipient’s mail server. Both the client and server side of SMTP run on every mail server.
 
How SMTP transfers a message:
1. The client side of SMTP has TCP establish a connection to port 25 at the server side of SMTP.
(If the server is down, the client tries again later.)
2. The server and client perform some initial SMTP handshaking, introducing themselves to each other.
3. The client sends the message into the TCP connection, and the server receives the message.
(The client repeats this process if it has other messages to send to the server; otherwise it instructs TCP to close the connection.) SMTP uses persistent connections.
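The client's half of this exchange is a short sequence of text commands. The sketch below only builds that sequence and does not talk to a server; the function name `smtp_client_commands` is an assumption, and in a real session the client waits for a numeric reply from the server after each command before sending the next. The addresses in the usage note are the ones used in the mail example elsewhere in these notes.

```python
def smtp_client_commands(client_host, sender, recipient, message_lines):
    """Return the lines an SMTP client sends to deliver one message."""
    commands = [
        f"HELO {client_host}",        # handshaking: the client introduces itself
        f"MAIL FROM:<{sender}>",      # envelope sender
        f"RCPT TO:<{recipient}>",     # envelope recipient
        "DATA",                       # the message itself follows
    ]
    for line in message_lines:
        # A line starting with "." gets an extra "." so it is not mistaken
        # for the end-of-message marker.
        commands.append("." + line if line.startswith(".") else line)
    commands.append(".")              # a lone "." ends the message
    commands.append("QUIT")           # no more messages: close the session
    return [command + "\r\n" for command in commands]
```

Calling it with `"crepes.fr"`, `"alice@crepes.fr"`, `"bob@hamburger.edu"` and a list of message lines gives the full client side of one delivery. To send several messages over the same connection, the client would repeat the part from `MAIL FROM` through the final "." before sending `QUIT`.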
 
The differences between SMTP and HTTP
1) SMTP is a push protocol, HTTP is a pull protocol.
In SMTP, the TCP connection is initiated by the machine that wants to send the file;
In HTTP, the TCP connection is initiated by the machine that wants to receive the file.
2) SMTP requires each message to be in 7-bit ASCII format.
In SMTP, if the message contains characters that are not 7-bit ASCII or contains binary data, the message has to be encoded into 7-bit ASCII.
HTTP does not impose this restriction.
3) How a document consisting of text, images or other media types is handled.
HTTP encapsulates each object in its HTTP response message.
SMTP places all of the message’s objects into one message.
 
An email message consists of a header and a message body (in ASCII), separated by a blank line (by CRLF).
Header: a series of header lines, containing peripheral information. Each header line contains readable text, consisting of a keyword followed by a colon followed by a value.
Every header must have a From: header line and a To: header line; other header lines (e.g. Subject:) are optional.
The “From:” and “To:” header lines are different from the SMTP commands “MAIL FROM” and “RCPT TO”: the header lines are part of the mail message, while the commands are part of the SMTP handshaking protocol.
 
The Multipurpose Internet Mail Extensions (MIME) define extra headers that allow a user agent to send content other than ASCII text.
The Content-Transfer-Encoding: header alerts the receiving user agent that the message body has been ASCII-encoded and indicates the type of encoding used. (When a user agent receives a message, it uses this header to convert the message body back to its original non-ASCII form.)
The Content-Type: header allows the receiving user agent to take an appropriate action on the message. (After converting the message body back to original form, uses this header to determine what action to take on the message body.)
 
Upon receiving a message, the SMTP receiving server appends a Received: header  line to the message, specifying the name of the SMTP server that sent the message (from), the name of the SMTP server that received the message (by), and the time at which the receiving server received the message.
Sometimes a single message may have multiple Received: header lines and a more complex Return-Path: header line, because a message may be forwarded through more than one SMTP server on the path between sender and recipient.
e.g.
Received: from crepes.fr by hamburger.edu; 12 Oct 98 15:27:39 GMT
From: alice@crepes.fr
To: bob@hamburger.edu
Subject: Picture of yummy crepe.
MIME-Version: 1.0
Content-Transfer-Encoding: base64
Content-Type: image/jpeg
 
(base 64 encoded data …… base 64 encoded data)
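A message like the one above can be produced with Python's standard `email` package, which adds the MIME headers and performs the base64 encoding. This is a sketch: the function name `build_image_message` is an assumption, and the bytes used in the usage note are placeholder data, not a real JPEG.

```python
from email.message import EmailMessage


def build_image_message(image_bytes: bytes) -> EmailMessage:
    """Build a mail message whose body is a base64-encoded JPEG."""
    msg = EmailMessage()
    msg["From"] = "alice@crepes.fr"
    msg["To"] = "bob@hamburger.edu"
    msg["Subject"] = "Picture of yummy crepe."
    # For binary content the library sets Content-Type, chooses base64 as the
    # Content-Transfer-Encoding, and adds MIME-Version: 1.0.
    msg.set_content(image_bytes, maintype="image", subtype="jpeg")
    return msg


# Placeholder bytes standing in for real image data.
example = build_image_message(b"\xff\xd8\xff\xe0 placeholder, not a real JPEG")
```

On the receiving side, `example.get_content()` uses the Content-Transfer-Encoding: header to turn the ASCII body back into the original bytes, and `example.get_content_type()` reports `image/jpeg`, which tells the user agent what to do with them.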
 
 
Why can’t the recipient’s user agent use SMTP to obtain the message?
Because obtaining the message is a pull operation, whereas SMTP is a push protocol.
Mail access protocols transfer messages from the recipient’s mail server to his/her local PC.
e.g. Post Office Protocol – Version 3 (POP3), Internet Mail Access Protocol (IMAP), HTTP.
 
POP3 protocol
POP3 begins when the client (user agent) establishes a TCP connection to the server (mail server) on port 110.
1) Authorization phase: The client sends a username and a password to authenticate the user.
2) Transaction phase: The client can retrieve messages, mark messages for deletion, remove deletion marks, and obtain mail statistics.
The client can be configured to “download and delete” or to “download and keep”.
3) Update phase: Occurs after the client has issued the quit command, ending the POP3 session, and the server deletes the messages marked for deletion.
During a POP3 session, the server maintains state information (to keep track of which user messages have been marked deleted), but does not carry state information across sessions.
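The three phases and the per-session state can be modelled as a small state machine. This is an in-memory toy, not a network implementation: the class name `Pop3Session` and the list-based mailbox are assumptions, and only the method names echo POP3 commands (`retr`, `dele`, `quit`).

```python
class Pop3Session:
    def __init__(self, mailbox, username, password):
        self._mailbox = mailbox                  # messages stored on the mail server
        self._credentials = (username, password)
        self._marked = set()                     # deletion marks: state kept for this session only
        self.phase = "authorization"

    def login(self, username, password):
        if self.phase == "authorization" and (username, password) == self._credentials:
            self.phase = "transaction"
        return self.phase == "transaction"

    def _require_transaction(self):
        if self.phase != "transaction":
            raise RuntimeError("command only valid in the transaction phase")

    def retr(self, number):
        self._require_transaction()
        return self._mailbox[number - 1]         # messages are numbered from 1

    def dele(self, number):
        self._require_transaction()
        self._marked.add(number)                 # only marks; nothing is removed yet

    def quit(self):
        self._require_transaction()
        self.phase = "update"
        # Update phase: the server now removes the messages marked for deletion.
        kept = [m for i, m in enumerate(self._mailbox, start=1) if i not in self._marked]
        self._mailbox[:] = kept
```

A "download and delete" client would call `retr` and then `dele` for each message before `quit`; a "download and keep" client would skip the `dele` calls, leaving the mailbox unchanged.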
 
IMAP protocol
IMAP is more complex than POP3.
An IMAP server associates each message with a folder. The IMAP protocol provides commands to allow users to create folders, move messages from one folder to another, and search remote folders for messages matching specific criteria.
IMAP provides commands to allow users to obtain components of messages.
 
Web-based E-mail
With a Web-based e-mail service, the user agent is a Web browser, and the user communicates with the remote mailbox via HTTP.
The sender’s browser – (HTTP) → The sender’s mail server – (SMTP) → The recipient’s mail server – (HTTP) → The recipient’s browser.
 
 
4. DNS
The Internet’s domain name system (DNS) is 1) a distributed database implemented in a hierarchy of DNS servers and 2) an application-layer protocol (runs over UDP, uses port 53) that allows hosts to query the distributed database.
DNS services:
1. Translating hostnames to IP addresses;
DNS is commonly employed by other application-layer protocols (HTTP, SMTP, FTP ...) to translate hostnames to IP addresses.
2. Host aliasing;
An application can invoke DNS to obtain the canonical hostname for a supplied alias hostname (typically more mnemonic) as well as the IP address of the host.
3. Mail server aliasing
A mail application can invoke DNS to obtain the canonical hostname for a supplied alias hostname as well as the IP address of the host.
4. Load distribution
For replicated Web servers, a set of IP addresses is associated with one canonical hostname.
When a client makes a DNS query for a name mapped to a set of addresses, the server responds with the entire set of IP addresses, but rotates the ordering of the addresses within each reply, distributing the traffic among the replicated servers.
DNS rotation is also used for e-mail.
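The rotation idea can be shown with a double-ended queue. The class name `RotatingRecordSet` is an assumption, and the addresses in the usage note come from a range reserved for documentation (192.0.2.0/24), so they are placeholders rather than real server addresses.

```python
from collections import deque


class RotatingRecordSet:
    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def reply(self):
        """Return the full address set, then rotate it for the next query."""
        answer = list(self._addresses)
        self._addresses.rotate(-1)   # the first address moves to the end
        return answer
```

With the addresses 192.0.2.1, 192.0.2.2 and 192.0.2.3, successive replies start with .1, then .2, then .3. Since clients typically contact the first address in the reply, requests end up spread across the replicated servers.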
 
How DNS Works in hostname-to-IP-address translation
1) An application invokes the client side of the DNS application running on the user’s host;
2) The DNS client sends a DNS query message containing the hostname to a DNS server in the network;
3) The DNS client receives a DNS reply message including the desired mapping.
4) The invoking application receives the mapping from the DNS client, so it can initiate a TCP connection to its server process.
(All DNS query and reply messages are sent within UDP datagrams to port 53.)
 
 
The DNS uses a large number of servers, organized in a hierarchical fashion and distributed around the world.

3 classes of DNS servers in the hierarchy:
1) Root DNS servers
2) Top-level domain (TLD) servers: are responsible for top-level domains (e.g. com, org, net, edu, gov) and all of the country top-level domains (e.g. uk, fr, ca, jp).
3) Authoritative DNS servers: Every organization with publicly accessible hosts on the Internet must provide publicly accessible DNS records that map the name of those hosts to IP addresses. An organization’s authoritative DNS server houses (originates) these DNS records.
Local DNS server / default name server: (does not strictly belong to the hierarchy) Each ISP has a local DNS server. When a host connected to an ISP makes a DNS query, the query is sent to the local DNS server, which acts as a proxy, forwarding the query into the DNS server hierarchy.

A typical DNS query chain:
1) The requesting host sends a DNS query message (containing the hostname to be translated, ‘gaia.cs.umass.edu’) to its local DNS server; [Recursive query]
2) The local DNS server forwards the query message to a root DNS server; [Iterative query]
3) The root DNS server takes note of the suffix (edu) and returns to the local DNS server a list of IP addresses for TLD servers responsible for that suffix;
4) The local DNS server resends the query message to one of these TLD servers; [Iterative query]
5) The TLD server takes note of the suffix (umass.edu) and responds with the IP address of the authoritative DNS server for that suffix;
(In general, the TLD server may know only of an intermediate DNS server, which in turn knows the authoritative DNS server for the hostname.)
6) The local DNS server resends the query message to the authoritative DNS server; [Iterative query]
7) The authoritative DNS server responds with the IP address of the hostname to be translated.
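The chain of iterative queries made by the local DNS server can be simulated with three lookup tables, one per level of the hierarchy. Everything concrete here is a placeholder: the server labels (`tld-edu`, `auth-umass`), the function name `resolve`, and the address 192.0.2.10 (taken from a range reserved for documentation) are assumptions, and real DNS messages, caching and intermediate servers are left out.

```python
# Placeholder data standing in for the three classes of DNS servers.
ROOT = {"edu": "tld-edu"}                                   # suffix -> TLD server
TLD = {"tld-edu": {"umass.edu": "auth-umass"}}              # domain -> authoritative server
AUTHORITATIVE = {"auth-umass": {"gaia.cs.umass.edu": "192.0.2.10"}}


def resolve(hostname):
    """Act as the local DNS server: query root, TLD and authoritative servers in turn."""
    labels = hostname.split(".")
    contacted = []

    # Steps 2-3: the root server looks at the top-level suffix and names a TLD server.
    tld_server = ROOT[labels[-1]]
    contacted.append("root")

    # Steps 4-5: the TLD server looks at the domain and names the authoritative server.
    auth_server = TLD[tld_server][".".join(labels[-2:])]
    contacted.append(tld_server)

    # Steps 6-7: the authoritative server returns the address for the hostname.
    address = AUTHORITATIVE[auth_server][hostname]
    contacted.append(auth_server)
    return address, contacted
```

The requesting host sees none of this: it sends one query to the local DNS server (the role played by `resolve`) and receives one reply, while the local server does the three iterative lookups on its behalf.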
Any DNS query can be iterative or recursive.
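e.g. In a DNS query chain for which all queries are recursive, each DNS server obtains the mapping on behalf of the server that queried it and relays the reply back along the chain.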