What We Can Learn from Internet Email Headers

By: Shally Steckerl

This is a thorough explanation of how to read email headers. It occurred to me that some folks may be interested in knowing what we can learn from Internet email headers, and I thought this would be a good opportunity to share a few things about email with the recruiting community. If you already know this stuff, can explain it better, or are not interested, my apologies, and please stop reading! This is intended as an overview. My experience comes from my own research after receiving malicious spam on several occasions, but I am by no means a definitive authority on this subject.

The email header is text carried in the message that we don't normally see. In Outlook, while looking at the email message, go to the View menu and then choose Options; you should see something like the following (from a sample message):

----- ACTUAL MESSAGE HEADER -----
Return-Path:
Received: from unknown ([151.202.177.115]) by imf04bis.bellsouth.net
(InterMail vM.5.01.01.01 201-252-104) with SMTP
id <20010809213458.KISJ11435.imf04bis.bellsouth.net@unknown>;
Thu, 9 Aug 2001 17:34:58 -0400
From:
Subject: CICS, COBOL II, DB2, VSAM and MVS-ES. - 3 Yrs with Merrill Lynch
Date: Thu, 9 Aug 2001 14:00:17
Message-Id: <39.231076.732715@>

SHALLY'S TRANSLATION

jobsintermedia@aol.com is the sender (using Intermedia as the mail provider)
Their IP address is 151.202.177.115
They use Bell Atlantic (NETBLK-BELL-ATLANTIC1)
Netname: BELL-ATLANTIC1
Netblock: 151.196.0.0 - 151.205.255.255
Maintainer: BAIS
Their website or ISP host is:
Verizon Global Networks Inc. (ZV20-ARIN) noc@gnilink.net
(703) 295-4583
imf04bis.bellsouth.net is the recipient's mail server

The "unknown" and the blank space after the @ in the following lines tell me this is a forged email: the sender is intentionally manipulating the headers.

20010809213458.KISJ11435.imf04bis.bellsouth.net@unknown
39.231076.732715@
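
The two signs above can also be checked mechanically. Here is a minimal sketch in Python, assuming the raw header text is available as a string. The function name forgery_hints and its two rules are purely illustrative; either sign can also turn up in legitimate mail, so treat the result as a hint rather than proof.

```python
import re

def forgery_hints(raw_headers: str) -> list[str]:
    """Return human-readable hints that a header block may be forged.

    Only the two signs discussed above are checked; this is an
    illustrative heuristic, not a reliable spam detector.
    """
    hints = []
    # Sign 1: a Received line whose sending host is recorded as "unknown".
    if re.search(r"^Received:\s+from\s+unknown\b", raw_headers, re.I | re.M):
        hints.append("Received line names the sender as 'unknown'")
    # Sign 2: an ID of the form <something@> with nothing after the @.
    if re.search(r"<[^<>@\s]+@>", raw_headers):
        hints.append("ID has an empty domain after the @")
    return hints

sample = (
    "Received: from unknown ([151.202.177.115]) by imf04bis.bellsouth.net\n"
    "Message-Id: <39.231076.732715@>\n"
)
for hint in forgery_hints(sample):
    print(hint)
```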

This is email fraud, if you want to get serious about it. My educated guess is that it's an H-1B dependent shop spamming recruiters with candidates. You can copy the header and send it to the offender's ISP for analysis, if you think it's worth the trouble. You can also send me a header sample and I can help decode it to help you decide if it's worth pursuing.

Superficially, it appears that email is passed directly from the sender's machine to the recipient's. Normally, this isn't true; a typical piece of email passes through at least four computers during its lifetime. This happens because most organizations have a dedicated machine to handle mail, called a "mail server." It's normally not the same machine that users are looking at when they read their mail. In the case of an ISP whose users dial in from their home computers, the "client" computer is the user's home machine, and the "server" is some machine that belongs to the ISP.

When a user sends mail, she normally composes the message on her own computer, then sends it off to her ISP's mail server. At this point her computer is finished with the job, but the mail server still has to deliver the message. It does this by finding the recipient's mail server, talking to that server and delivering the message. The message then sits on that second mail server until the recipient comes along to read her mail. When she retrieves it on her own computer, normally it is deleted from the mail server in the process.

Example Spam Message (fictitious names and IDs used to protect privacy):

Diane sends a letter to an association. She composes it at her workstation (which is called diane@someisp.com). The composed text is passed from there to the mail server, someassociation@yahoogroups.com or mq.egroups.com. This is the last Diane sees of it. The rest is handled by machines with no intervention from her. The mail server, seeing that it has a message for someone at egroups.com, contacts its mail server (called, hypothetically, l10.egroups.com) and delivers the mail to it. Now the message is stored on l10.egroups.com until yahoogroups processes it.

During all this processing, headers will be added to the message three times:
1. At composition time, by Diane's Outlook;
2. When that program hands control off to mail.yahoogroups.com; and
3. At the transfer from egroups to yahoo.

Let's watch the evolution of these headers. As generated by Diane's mailer and handed off to yahoogroups.com:

From: "Diane Smith"
To:
Date: Tue, Mar 18 1997 14:36:14 PST
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
Subject: [someassociation] Porno Spam

Here is the header when the email system at Diane's someisp.com account transmits the message to the mail host at yahoogroups.com (which is mq.egroups.com ).

Received: from Smiths (A010-1198.PHNX.splitrock.net [209.254.234.182]) by pimout4-int.someisp.com (8.11.0/8.11.0) with SMTP id f6EHrsk152696 for ; Sat, 14 Jul 2001 13:53:54 -0400
To:
Message-ID:
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
From: "Diane Smith"
Date: Sat, 14 Jul 2001 10:51:40 -0700
Subject: [someassociation] Porno Spam

When the egroups mail system (mta1) receives the message from the Prodigy server called pimout4-int.someisp.com, it adds a new first line:

Received: from unknown (HELO pimout4-int.someisp.com ) (207.115.63.103) by mta1 with SMTP; 14 Jul 2001 17:53:56 -0000

Received: from Smiths (A010-1198.PHNX.splitrock.net [209.254.234.182]) by pimout4-int.someisp.com (8.11.0/8.11.0) with SMTP id f6EHrsk152696 for ; Sat, 14 Jul 2001 13:53:54 -0400
To:
Message-ID:
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
From: "Diane Smith"
Date: Sat, 14 Jul 2001 10:51:40 -0700
Subject: [someassociation] Porno Spam

This last set of headers is the one that I see on the letter when I download and read mail. Here's a line-by-line analysis of these headers and exactly what each one means.

Received: from mq.egroups.com ([208.50.144.79]) by ehost002.intermedia.net with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id N8JKS1JG; Sat, 14 Jul 2001 10:57:50 -0700

This piece of mail was received from a machine calling itself mq.egroups.com with the IP address [208.50.144.79]. It was received by ehost002.intermedia.net which is running MS Exchange. The receiving machine assigned the ID number N8JKS1JG to the message. (This is used internally by the machine---it's something an administrator would need to know to look up the message in the machine's log files, but it's not usually meaningful to anyone else.)

Received: from Smiths (A010-1198.PHNX.splitrock.net [209.254.234.182]) by pimout4-int.someisp.com (8.11.0/8.11.0) with SMTP id f6EHrsk152696 for ; Sat, 14 Jul 2001 13:53:54 -0400

The "for" clause of this header records the envelope recipient: the message was addressed to someassociation@yahoogroups.com. Note that this part of the header is not related to the To: line.

Sat, 14 Jul 2001 13:53:54 -0400

This mail transfer happened on Sat, 14 Jul 2001 13:53:54 -0400 (1:53:54 in the afternoon) Eastern Daylight Time (which is 4 hours behind Greenwich Mean Time; hence the "-0400").
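
The offset arithmetic can be checked with Python's standard library. This is just a small sketch; the variable names are my own, and the timestamp is the one from the Received: line above.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime

# Parse the timestamp exactly as it appears in the Received: header.
stamp = parsedate_to_datetime("Sat, 14 Jul 2001 13:53:54 -0400")

# The "-0400" suffix becomes the UTC offset, expressed here in hours.
print(stamp.utcoffset().total_seconds() / 3600)

# The same instant expressed in UTC (Greenwich Mean Time).
utc_stamp = stamp.astimezone(timezone.utc)
print(utc_stamp.isoformat())  # 2001-07-14T17:53:54+00:00
```

Converting every Received: timestamp to UTC this way makes it easy to compare hops that were stamped in different time zones.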

Received: from unknown (HELO pimout4-int.someisp.com ) (207.115.63.103) by mta1 with SMTP; 14 Jul 2001 17:53:56 -0000

This line documents the mail handoff from someisp.com's mail server to mta1; this handoff happened at 14 Jul 2001 17:53:56 -0000 (10:53:56 Phoenix time). The sending machine called itself pimout4-int.someisp.com in its HELO greeting; the receiving machine recorded its hostname as "unknown" (presumably because it could not look up a name for the address), and its IP address is 207.115.63.103.

From: "Diane Smith"

The mail was sent by diane@someisp.com, who gives her real name as Diane Smith.

To:

The letter is addressed to someassociation@yahoogroups.com.

Date: Sat, 14 Jul 2001 10:51:40 -0700

The message was composed at 10:51:40 Arizona time on Saturday, July 14, 2001.

Message-ID:

The message has been given this number by Prodigy to identify it. This ID is different from the SMTP and ESMTP ID numbers in the Received: headers because it is attached to this message for life. The other IDs are only associated with specific mail transactions at specific machines, so that one machine's ID number means nothing to another machine. Sometimes, as was the case in this example, the Message-ID has the sender's email address embedded in it; more often, it has no intelligible meaning of its own.

X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)

The message was sent using MS Outlook; the header even gives you the build number.
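
The same line-by-line reading can be done programmatically. The sketch below uses Python's standard email package on a shortened copy of the sample headers. The address inside the From: line is filled in from the explanation above purely for illustration, and the message body is a placeholder.

```python
from email import message_from_string
from email.utils import parseaddr

# Shortened sample; the From: address and the body are illustrative.
raw = (
    "Received: from mq.egroups.com ([208.50.144.79]) by ehost002.intermedia.net\n"
    "\twith SMTP id N8JKS1JG; Sat, 14 Jul 2001 10:57:50 -0700\n"
    'From: "Diane Smith" <diane@someisp.com>\n'
    "Date: Sat, 14 Jul 2001 10:51:40 -0700\n"
    "X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)\n"
    "Subject: [someassociation] Porno Spam\n"
    "\n"
    "Message body goes here.\n"
)

msg = message_from_string(raw)

# Split the From: header into the display name and the address.
name, address = parseaddr(msg["From"])
print(name, address)

# Header lookups are case-insensitive.
print(msg["X-Mailer"])

# A message can carry several Received: headers; get_all returns them all.
print(len(msg.get_all("Received", [])))
```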

Regular Mail Protocols

This section is a little more technical than the others, and focuses on the details of how mail gets from one point to another. You don't need to understand every word, but familiarity with this subject can do a lot to clarify what's happening in strange situations. Since email spammers often intentionally create such strange situations (partly to confuse their victims), the ability to understand those situations can be quite helpful.

To communicate over a network, computers often use "points of entry" called ports. You might think of a port as a channel, like those on your TV or radio, through which computers listen to communications from the network. To listen to many communications at once, a computer needs to have multiple ports; to distinguish them, they're generally numbered. On systems connected to the Internet (or any systems using the same protocols for email), port 25 is of particular importance for the present discussion. That's the port used to transmit and receive mail. As an interesting aside, port 80 is where most of your web browsing occurs.

Normal Behavior

Let's return to the example of the last section, and specifically to the point where Prodigy communicates with egroups. What really happens here is that pimout4-int.someisp.com opens a connection to port 25 of mq.egroups.com, and sends the mail through that connection, along with some administrative data. The commands it uses to do this, and the responses issued by the receiving system, are more or less human-readable. They're commands in a rudimentary language called SMTP, for Simple Mail Transfer Protocol. Someone eavesdropping on the "conversation" between the machines would see something like this:

220 ehost002.intermedia.net with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id N8JKS1JG; Sat, 14 Jul 2001 10:57:50 -0700
HELO pimout4-int.someisp.com
250 ehost002.intermedia.net Hello pimout4-int.someisp.com [207.115.63.103], pleased to meet you
MAIL FROM: diane@someisp.com
250 diane@someisp.com... Sender ok
RCPT TO: shally@jobmachine.net (it was actually sent from yahoogroups to me)
250 shally@jobmachine.net... Recipient ok
DATA
354 Enter mail, end with "." on a line by itself
Received: from mq.egroups.com ([208.50.144.79]) by ehost002.intermedia.net
ehost002.intermedia.net (8.11.0) id N8JKS1JG; Sat, 14 Jul 2001 10:57:50 -0700
From: diane@someisp.com (Diane Smith)
To: shally@jobmachine.net
Date: Sat, 14 Jul 2001 10:51:40 -0700
Message-ID:
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
Subject: [someassociation] Porno Spam

I need some advice/direction on how to block pornographic Spam. I get soliciting emails that contain inappropriate content and language. I'm no prude, but this stuff is not what I want to receive on my email.

Thanks,


Diane Smith
BBS Consulting
.
250 LAA20869 Message accepted for delivery
QUIT
221 ehost002.intermedia.net closing connection

This whole transaction depends on five commands at the core of SMTP (there are a few others, but they're sideline issues to the actual process of passing mail from one place to another): HELO, MAIL FROM, RCPT TO, DATA, and QUIT.

HELO identifies the sending machine; "HELO pimout4-int.someisp.com " should be read as "Hello, I'm pimout4-int.someisp.com ". The sender can lie; nothing, in principle, prevents mail.bieberdorf.edu from saying "Hello, I'm thefonz.xyz.gov" (HELO thefonz.xyz.gov) or even "Hello, I'm a misconfigured computer" (HELO a misconfigured computer). However, in most circumstances, the receiver has some tools with which to discover this and find out the sending machine's real identity.

MAIL FROM initiates mail processing; it means "I have mail to deliver from so-and-so". The address given turns into the so-called "Envelope From"; it need not be the same as the sender's own address! This apparent security hole is inevitable (after all, the receiving machine doesn't know anything about who has what username on the sending machine), and in certain circumstances it turns out to be a useful feature.

RCPT TO is the counterpart of MAIL FROM; it specifies the intended recipient of the mail. One piece of mail can be sent to multiple recipients simply by including multiple RCPT TO commands (see the section below on mail relaying, which explains how this feature is sometimes abused on insecure systems). The given address turns into the so-called "Envelope To." It actually determines who the mail will be delivered to, regardless of what the To: line in the message says.

DATA starts the actual mail entry. Everything entered after a DATA command is considered part of the message; there are no restrictions on its form. Lines at the beginning of the message (before the first blank line) that start with a single word and a colon are considered to be headers by most mail programs. A line consisting only of a period terminates the message. QUIT terminates the connection.
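
To make the sequence concrete, here is a rough Python sketch that builds the lines a client would send in a basic session. The function name smtp_client_lines is hypothetical, server replies are not modeled, and the addresses are written in the angle-bracket form that the SMTP specification uses for MAIL FROM and RCPT TO. It also shows one detail implied by the lone-period rule: a body line that itself begins with a period gets an extra period prepended so that it is not mistaken for the end of the message.

```python
def smtp_client_lines(helo_name, envelope_from, envelope_to, message):
    """Build the lines a client sends in a basic SMTP session.

    envelope_to is a list; each address gets its own RCPT TO command.
    Server replies (220, 250, 354, 221) are not modeled here.
    """
    lines = [f"HELO {helo_name}", f"MAIL FROM:<{envelope_from}>"]
    lines += [f"RCPT TO:<{rcpt}>" for rcpt in envelope_to]
    lines.append("DATA")
    for line in message.splitlines():
        # A body line starting with "." gets an extra "." so that it is
        # not mistaken for the end-of-message marker.
        lines.append("." + line if line.startswith(".") else line)
    lines.append(".")   # a lone period ends the message
    lines.append("QUIT")
    return lines

session = smtp_client_lines(
    "pimout4-int.someisp.com",
    "diane@someisp.com",
    ["shally@jobmachine.net"],
    "Subject: [someassociation] Porno Spam\n\nI need some advice.\n",
)
print("\n".join(session))
```

Note that the envelope addresses are passed separately from the message text, which is why the To: line inside the message has no effect on delivery.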

SMTP is defined in RFC 821 (since superseded by RFC 2821 and then RFC 5321). Copies of the RFCs are widely available on the Web. The specification is well worth reading, as it sheds much light on the intricacies of mail processing.

Unusual Scenarios

The scenario above is a little bit oversimplified. The biggest assumption is that the mail servers of the two organizations involved have free access to one another. This was almost always true in the early days of the Internet, and it's still sometimes the case today, but as security has become a greater concern, and as organizations and networks have gotten bigger, sometimes requiring many separate mail servers, it has become more and more unusual.

Firewalls

Many, perhaps most, organizations with computers on the Internet are protected by some kind of firewall. A firewall is just a computer whose primary job is to act as a gatekeeper between an organization's own machines and the great unwashed world of the net (so that, for instance, crackers can't easily connect to a piece of IBM's corporate network and start stealing corporate secrets). From the standpoint of another computer trying to deliver mail to a system behind a firewall, what this means is that you can't talk directly to the system; you have to talk to the firewall.

No surprises here; this just introduces another "hop" in the journey of a piece of email, with the firewall acting as just another machine that passes mail. If ehost002.intermedia.net had a firewall in place, here's what the headers from our sample piece of email might look like. Notice the first Received: line. (Let's pretend that the firewall machine is named firewall.ehost002.intermedia.net; in fact, giving a machine a name like "firewall" is tantamount to inviting every teenage cracker-wannabe in the world to try to break in, so firewalls usually have perfectly ordinary, innocuous names.)

Received: from firewall.ehost002.intermedia.net (firewall.ehost002.intermedia.net
[121.214.13.129]) by ehost002.intermedia.net (8.11.0/8.11.0) with SMTP id
LAA20869 for ; Sat, 14 Jul 2001 10:57:50 -0700
Received: from mq.egroups.com ([208.50.144.79]) by
firewall.ehost002.intermedia.net (8.11.0/8.11.0) with ESMTP id LAA20869 for;
Sat, 14 Jul 2001 10:57:50 -0700

In similar fashion, if all outgoing mail from intermedia.net (my Exchange host for jobmachine.net) were routed through a firewall, there would be another Received: line inserted by that firewall machine. By the same token, there might be machines involved that aren't strictly firewalls, but simply common points for routing. Intermedia.net may maintain machines in many physical locations, with several separate mail servers. It may use a single machine (called, for example, mailgate.intermedia.net) to decide which server incoming mail should be routed to.

The history of the message can be reconstructed by reading the Received: headers from bottom to top: pimout4-int.someisp.com received it from Smiths (which is A010-1198.PHNX.splitrock.net, at IP address 209.254.234.182). Pimout4-int.someisp.com then sent it to mta1, which in turn routed it to mq.egroups.com (probably the egroups master email server), which sent it to 10.1.4.56, from where it went to ehost002.intermedia.net, which knew how to get hold of my inbox.
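
Reading the Received: headers from bottom to top can be automated. The following Python sketch parses a lightly trimmed copy of the three Received: lines from the example and prints the hops in travel order. The function name mail_path and the regular expression are my own simplifications; real-world Received: lines vary a good deal, so this is not a general parser.

```python
import re
from email import message_from_string

def mail_path(raw_message: str) -> list[tuple[str, str]]:
    """Return (from_host, by_host) pairs in the order the hops happened."""
    msg = message_from_string(raw_message)
    hops = []
    # Each server adds its Received: line at the top, so the oldest hop
    # is the last one in the header block; reverse to get travel order.
    for received in reversed(msg.get_all("Received", [])):
        match = re.search(r"from\s+(\S+).*?\bby\s+(\S+)", received, re.S)
        if match:
            hops.append((match.group(1), match.group(2)))
    return hops

raw = (
    "Received: from mq.egroups.com ([208.50.144.79]) by ehost002.intermedia.net"
    " with SMTP id N8JKS1JG; Sat, 14 Jul 2001 10:57:50 -0700\n"
    "Received: from unknown (HELO pimout4-int.someisp.com) (207.115.63.103)"
    " by mta1 with SMTP; 14 Jul 2001 17:53:56 -0000\n"
    "Received: from Smiths (A010-1198.PHNX.splitrock.net [209.254.234.182])"
    " by pimout4-int.someisp.com (8.11.0/8.11.0) with SMTP id f6EHrsk152696;"
    " Sat, 14 Jul 2001 13:53:54 -0400\n"
    "\n"
)

for sender, receiver in mail_path(raw):
    print(f"{sender} -> {receiver}")
```

The "from" name is whatever the receiving server chose to record, so it is only as trustworthy as that server; the IP address in brackets or parentheses is usually the more reliable clue.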
