12 - Networking Essentials

What this session is

About 45 minutes. You'll learn the network commands every Linux user eventually needs - fetch URLs, SSH to remote machines, copy files, see what's listening on ports.

Fetch a URL: curl

curl https://example.com                     # print to terminal
curl -o page.html https://example.com        # save to file
curl -I https://example.com                  # HEAD request (headers only)
curl -L https://bit.ly/something             # follow redirects
curl -X POST -d "name=alice" https://api.example.com    # POST request

curl is the universal HTTP client. Read its man page once; the flag inventory is huge but you'll use 5-10 of them regularly.

For JSON APIs:

curl -s https://api.github.com/users/octocat | jq

jq is a JSON processor: it pretty-prints and filters JSON, and you'll pair it with curl constantly. Install it with sudo apt install jq (Debian/Ubuntu) or brew install jq (macOS).
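Beyond pretty-printing, jq can pull out individual fields. A minimal sketch, assuming jq is installed; the function name github_user_summary is made up for illustration, and login and public_repos are fields of the GitHub user object:

```shell
# github_user_summary: read one GitHub user JSON object on stdin
# and print a one-line summary. (Hypothetical helper name.)
github_user_summary() {
  # -r prints the raw string instead of a JSON-quoted one;
  # \( ... \) interpolates a field into the string.
  jq -r '"\(.login) has \(.public_repos) public repos"'
}
```

Typical use would be: curl -s https://api.github.com/users/octocat | github_user_summary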

wget is a simpler alternative for "just download this":

wget https://example.com/file.zip

ssh: log into remote machines

ssh user@host                # log in
ssh user@host "command"      # run one command and exit
ssh -p 2222 user@host        # custom port (default 22)

The remote shell prompt is yours. Whatever you type runs on the remote machine.

First time connecting to a host: SSH shows the host's key fingerprint and asks you to confirm it. Type yes to accept, ideally after checking the fingerprint through another channel (out-of-band). The host key is then stored in ~/.ssh/known_hosts.

SSH keys: passwordless login

Tired of typing a password every time? Use a key pair instead.

Generate:

ssh-keygen -t ed25519 -C "your_email@example.com"

Saves ~/.ssh/id_ed25519 (private - keep secret, never share) and ~/.ssh/id_ed25519.pub (public - fine to share).

Copy your public key to the remote:

ssh-copy-id user@host

After: ssh user@host logs you in without a password.

Permissions matter:

  • ~/.ssh must be 700.
  • ~/.ssh/id_* private keys must be 600.
  • ~/.ssh/id_*.pub public keys can be 644.

Wrong permissions and SSH refuses to use the keys.
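The permission rules above can be applied in one go. This is a sketch rather than an official tool: the function name tighten_ssh_perms is made up, and it assumes your keys follow the default id_* naming.

```shell
# tighten_ssh_perms [DIR]: apply the modes listed above to an SSH directory.
# DIR defaults to ~/.ssh. (Hypothetical helper; uses bash's `local`.)
tighten_ssh_perms() {
  local dir=${1:-$HOME/.ssh} f
  chmod 700 "$dir" || return 1
  for f in "$dir"/id_*; do
    [ -f "$f" ] || continue        # the glob matched nothing
    case $f in
      *.pub) chmod 644 "$f" ;;     # public keys: readable by anyone
      *)     chmod 600 "$f" ;;     # private keys: owner only
    esac
  done
}
```

Run it as tighten_ssh_perms, then check the result with ls -la ~/.ssh.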

Copy files: scp and rsync

scp (secure copy):

scp file.txt user@host:/path/to/dest/        # local to remote
scp user@host:/remote/file.txt local-name    # remote to local
scp -r mydir user@host:/dest/                # recursive (directories)

rsync is much smarter - incremental, resumable, efficient over slow links:

rsync -avh source/ user@host:/dest/          # sync directory contents
rsync -avh --delete src/ dest/               # also delete dest files not in src
rsync -avh --dry-run src/ dest/              # show what WOULD change

-a = archive (recurses into directories and preserves permissions, timestamps, symlinks, etc.), -v = verbose, -h = human-readable sizes.

The trailing / on the source matters:

  • rsync src/ dest/ - copy contents of src into dest.
  • rsync src dest/ - copy src itself into dest (as dest/src).
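The trailing-slash rule is easiest to believe after seeing it. The sketch below runs both forms against throwaway local directories (no remote host needed) and lists what each one created. The function name and the directory names with_slash and without_slash are made up for the demo.

```shell
# rsync_slash_demo: show what each form of the source path produces.
# Works entirely inside a temporary directory, which is removed afterwards.
rsync_slash_demo() {
  local tmp
  tmp=$(mktemp -d) || return 1
  mkdir -p "$tmp/src" "$tmp/with_slash" "$tmp/without_slash"
  echo hello > "$tmp/src/a.txt"

  rsync -a "$tmp/src/" "$tmp/with_slash/"      # trailing slash: contents of src
  rsync -a "$tmp/src"  "$tmp/without_slash/"   # no slash: the src directory itself

  # List the resulting files relative to the temp dir, in a fixed order.
  (cd "$tmp" && find with_slash without_slash -type f | LC_ALL=C sort)
  rm -rf "$tmp"
}
```

Calling rsync_slash_demo prints with_slash/a.txt and without_slash/src/a.txt: the same file, one directory level apart.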

Use rsync for everything except trivial single-file copies.

What's listening on what port: ss

ss -tlnp                # TCP, Listening, Numeric, Process info
ss -tunlp               # also UDP

Shows which programs are listening on which ports.

Older command: netstat -tlnp. Same idea, deprecated in favor of ss.

sudo ss -tlnp           # needs sudo to show process info for other users' processes

What process owns a port: lsof -i

sudo lsof -i :8080      # what's on port 8080
sudo lsof -i tcp        # all TCP usage

Useful when "port already in use" - lsof tells you who's holding it.
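The same question can be answered from ss -tlnp output, which includes a pid=... field for each listener. A small sketch, assuming that output format; the helper name pids_from_ss is made up:

```shell
# pids_from_ss: read `ss -tlnp` lines on stdin, print each owning PID once.
pids_from_ss() {
  grep -o 'pid=[0-9]*' |   # keep only the pid=NNNN fragments
    cut -d= -f2 |          # drop the "pid=" prefix
    sort -u                # one line per distinct PID
}
```

For example, sudo ss -tlnp | grep ':8080 ' | pids_from_ss prints the PID(s) holding port 8080, ready to inspect with ps -p or stop with kill.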

DNS lookup: dig and nslookup

dig example.com
dig +short example.com           # just the IP(s)
dig example.com MX               # mail exchanger records
nslookup example.com             # older alternative

dig is the modern, scriptable tool. nslookup is older and still around.

Ping and traceroute

ping example.com                 # send ICMP echo; press Ctrl-C to stop
traceroute example.com           # show the route packets take

Useful for "is this host reachable?" and "where does the path break?"

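Many hosts and firewalls block or rate-limit ICMP, so a failed ping or a traceroute full of * * * doesn't prove the host is down. If it's installed, mtr (ping and traceroute combined, interactive) gives a clearer picture.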

Firewall: ufw (Ubuntu)

sudo ufw status                  # what rules exist
sudo ufw allow 22/tcp            # allow SSH
sudo ufw allow http              # allow HTTP (port 80)
sudo ufw enable                  # turn on the firewall
sudo ufw deny 23                 # block telnet

Beyond beginner; mentioned for awareness. Most desktop users don't manage their firewall manually.

A real session: SSH into a server, sync a directory

# One-time setup: generate key, copy to remote
ssh-keygen -t ed25519
ssh-copy-id alice@my-server.example.com

# Now SSH passwordless
ssh alice@my-server.example.com
# ... do stuff on remote ...
exit

# Sync a local dir to the server
rsync -avh --delete ~/projects/myapp/ alice@my-server.example.com:/srv/myapp/

# Or fetch a file from the server
scp alice@my-server.example.com:/var/log/app.log ./

A few useful patterns

Test a webhook endpoint:

curl -X POST https://example.com/webhook \
  -H "Content-Type: application/json" \
  -d '{"event":"test"}'

Download a tar archive and extract:

curl -L https://example.com/foo.tar.gz | tar -xz

The tar -xz extracts a gzipped tar from stdin.

Stream output from a remote command:

ssh user@host "tail -f /var/log/app.log"

tail -f on the remote, output streams to your local terminal.

Going deeper

The commands above connect you to remote machines. This section adds the depth that turns "the connection isn't working" into a layer-by-layer diagnosis: the failure modes everyone hits, with the exact error text and what each one means.

The connection-failure decision tree (read the error)

When a connection fails, the error message tells you which layer broke - and each points at a different fix. Stop guessing; read the words:

  • "Connection refused" -> you reached the host, but nothing is listening on that port (or a firewall actively rejected you). The host said "no." Check: is the service running? Is it on the port you think? ss -tlnp | grep <port> on the server.
  • "Connection timed out" -> packets went out and nothing came back - usually a firewall silently dropping (not rejecting) them, or the host is down/unreachable. The host said nothing. Check: firewall rules, security groups, is the host even up (ping).
  • "No route to host" -> the network layer can't find a path to the host at all - bad routing, wrong subnet, host genuinely offline. This failure happens earlier than the other two: the attempt never gets as far as the host's port. Check: ip route, are you on the right network.
  • "Name or service not known" / "Could not resolve host" -> DNS failed - the hostname didn't resolve to an IP. The network might be fine; the name is the problem. Check: dig <host>, /etc/resolv.conf.

This is the single most useful networking skill: refused = nothing listening; timed out = firewall drop / host down; no route = routing/subnet; can't resolve = DNS. Four error strings, four different fixes. (This is the userspace view of the same DROP-vs-REJECT distinction the senior Linux netfilter investigation shows from the kernel side.)
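The four-way mapping above is mechanical enough to write down as a lookup. This is an illustrative sketch, not a real tool: classify_net_error and its one-word answers are made up, and the exact wording of error messages varies between tools and versions, so treat the patterns as assumptions.

```shell
# classify_net_error MESSAGE: map an error string to the layer it points at.
classify_net_error() {
  case $1 in
    *"Connection refused"*)                  echo "service"  ;;  # nothing listening (or REJECT)
    *"timed out"*)                           echo "firewall" ;;  # silent drop, or host down
    *"No route to host"*)                    echo "routing"  ;;  # no path at the network layer
    *"Name or service not known"*|\
    *"Could not resolve host"*|\
    *"Temporary failure in name resolution"*) echo "dns"     ;;  # the name, not the network
    *)                                       echo "unknown"  ;;
  esac
}
```

For example, classify_net_error "ssh: connect to host example.com port 22: Connection refused" prints service.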

Step-by-step: which layer is broken?

When something's unreachable, walk up the layers - each command rules out one:

# 1. Does the name resolve? (DNS layer)
$ dig +short api.example.com
93.184.216.34                    # got an IP -> DNS works. Empty -> DNS is the problem.

# 2. Can packets reach the host? (network layer)
$ ping -c2 93.184.216.34
2 packets transmitted, 2 received    # reachable. 100% loss -> firewall/down/routing.

# 3. Is the specific port open? (transport layer)
$ nc -zv 93.184.216.34 443
Connection to 93.184.216.34 443 port [tcp/https] succeeded!   # port open
# "Connection refused" here -> nothing listening on 443. "timed out" -> firewall.

# 4. Does the application respond correctly? (application layer)
$ curl -v https://api.example.com/health
< HTTP/1.1 200 OK                # the app is healthy

Each step isolates a layer. If step 1 fails it's DNS; if 2 fails it's network/firewall; if 3 fails it's the service or a port firewall; if only 4 fails it's the application. One caveat for step 2: many hosts and firewalls drop ICMP, so a failed ping alone is not conclusive - run step 3 before blaming the network. You go from "the network is broken" (useless) to "DNS resolves, host pings, port 443 is open, but the app returns 500" (precise, actionable) in four commands. This ladder is how professionals debug connectivity instead of flailing.
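The ladder can be wrapped in one function that stops at the first broken layer. Treat this as a sketch under assumptions: the name first_broken_layer and its one-word answers are made up, ping -W takes seconds as on Linux, nc supports -z and -w, and the service speaks HTTPS on the given port. A failed ping is only reported as a warning, since ICMP may be filtered on an otherwise healthy host.

```shell
# first_broken_layer HOST [PORT] [URL]: print dns, transport, application, or ok.
first_broken_layer() {
  local host=$1 port=${2:-443} url ip
  url=${3:-https://$host:$port/}

  # 1. DNS: keep the first line that looks like an IPv4 address
  #    (skips CNAME lines and dig's own ";;" error lines).
  ip=$(dig +short +time=2 +tries=1 "$host" 2>/dev/null |
         grep -E '^[0-9]+(\.[0-9]+){3}$' | head -n1)
  [ -n "$ip" ] || { echo dns; return 1; }

  # 2. Network: informational only, because ICMP is often blocked.
  ping -c2 -W2 "$ip" >/dev/null 2>&1 ||
    echo "warning: no ping reply from $ip (ICMP may be filtered)" >&2

  # 3. Transport: can we open a TCP connection to the port?
  nc -z -w3 "$ip" "$port" >/dev/null 2>&1 || { echo transport; return 1; }

  # 4. Application: -f makes curl fail on HTTP status 400 and above.
  curl -fsS --max-time 10 -o /dev/null "$url" 2>/dev/null ||
    { echo application; return 1; }

  echo ok
}
```

For instance, first_broken_layer api.example.com 443 https://api.example.com/health answers with a single word naming where to look next.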

curl -v is an X-ray (read the whole conversation)

curl -v shows the entire request/response, and reading it diagnoses most HTTP problems:

$ curl -v https://api.example.com/data
* Trying 93.184.216.34:443...
* Connected to api.example.com (93.184.216.34) port 443      # TCP connected - network OK
* SSL connection using TLSv1.3                                # TLS handshake OK
> GET /data HTTP/1.1                                          # what you SENT (> lines)
> Host: api.example.com
> Authorization: Bearer xxx
< HTTP/1.1 401 Unauthorized                                   # what you GOT (< lines)
< WWW-Authenticate: Bearer error="invalid_token"

The > lines are your request, < lines are the response. Here the TCP and TLS layers are fine (so it's not a network problem) - the application returned 401 because of a bad token. Without -v you'd see only the response body, which is often empty or a terse error message; with it, you see exactly which layer succeeded and which failed, and the server's own explanation. For TLS/certificate errors specifically, curl -v shows the handshake failing with the reason (expired cert, name mismatch) - turning "SSL error" into a specific cause.
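When the verbose output gets long, it helps to keep only the two sides of the HTTP conversation. A small sketch (the helper name http_conversation is made up) that relies on curl writing its -v trace to stderr:

```shell
# http_conversation: read `curl -v` trace on stdin,
# keep only the request (>) and response (<) header lines.
http_conversation() {
  tr -d '\r' |        # header lines arrive with CRLF endings
    grep '^[<>] '     # "> ..." is what you sent, "< ..." is what came back
}
```

Use it as curl -sv https://example.com/ 2>&1 >/dev/null | http_conversation. The redirections send the trace (stderr) into the pipe and discard the response body.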

What you'll see: a port already in use

A frequent real failure - you start a server and:

$ python3 -m http.server 8080
OSError: [Errno 98] Address already in use

Something already holds port 8080. Find it (the lsof/ss from earlier, applied to a real incident):

$ ss -tlnp | grep 8080
LISTEN 0 5 *:8080 *:* users:(("python3",pid=4127,fd=3))    # PID 4127 has it
$ # it's a leftover server you forgot - kill it:
$ kill 4127

"Address already in use" = another process owns the port; ss -tlnp (or lsof -i :8080) names the culprit, then kill it. (Occasionally it's a recently-closed socket in TIME_WAIT state - then you wait ~60s or set SO_REUSEADDR, but usually it's a zombie server, the networking cousin of the zombie process from the Processes chapter.)

SSH that won't connect - the common causes

SSH failures have their own diagnostic flag - -v (or -vvv for more):

$ ssh -v user@host

The verbose output shows where it stops:

  • "Permission denied (publickey)" -> your key isn't accepted. Is your public key in the server's ~/.ssh/authorized_keys? Are permissions too open? The client refuses a private key that is group- or world-readable (chmod 600 ~/.ssh/id_ed25519, the permissions lesson applied), and the server, with its default StrictModes setting, ignores authorized_keys when your home directory, ~/.ssh, or the file itself is writable by other users.
  • "Connection timed out" -> firewall/security-group blocking port 22 (the decision tree above).
  • "Host key verification failed" -> the server's host key no longer matches the one saved in ~/.ssh/known_hosts (a reinstall, or a real concern). First find out why it changed, then remove the old key from ~/.ssh/known_hosts.
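For the known_hosts cleanup, ssh-keygen -R removes every saved key for one host, so you don't have to edit the file by hand. A minimal sketch; the wrapper name forget_host_key is made up, and it should only be run once you know why the host key changed:

```shell
# forget_host_key HOST [FILE]: drop HOST's saved key(s) from a known_hosts file.
# FILE defaults to ~/.ssh/known_hosts.
forget_host_key() {
  local host=$1 file=${2:-$HOME/.ssh/known_hosts}
  [ -n "$host" ] || { echo "usage: forget_host_key HOST [FILE]" >&2; return 2; }
  # -R removes all keys belonging to HOST; -f selects the file to edit.
  # ssh-keygen keeps a backup of the previous contents as FILE.old.
  ssh-keygen -R "$host" -f "$file"
}
```

The next ssh user@host then shows the new fingerprint and asks you to confirm it, just like a first connection.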

ssh -v showing exactly which step fails turns SSH from a black box into a readable handshake.

Try it (with what you'll see)

  1. Trigger each error: curl http://localhost:9999 (refused - nothing listening), curl --connect-timeout 5 http://192.0.2.1 (timed out - a reserved documentation address that nothing answers on; the flag saves you from waiting minutes), curl http://nonexistent.invalid (can't resolve - bad DNS). Read each error string and name the layer.

  2. Walk the four-layer ladder against a real site: dig +short, ping, nc -zv host 443, curl -v. Confirm each layer in turn.

  3. curl -v https://example.com and read the > (sent) and < (received) lines. Find where TLS succeeds and the HTTP status comes back.

  4. Start two servers on the same port; see "Address already in use"; find the holder with ss -tlnp | grep <port>; kill it.

  5. ssh -v to any host (even one you can't log into) and watch the handshake steps - key exchange, auth attempts - to see where it stops.

Exercise

  1. Fetch a URL:

    curl -s https://api.github.com/users/octocat
    
    Then pipe to jq if installed:
    curl -s https://api.github.com/users/octocat | jq
    

  2. DNS lookup:

    dig +short github.com
    

  3. See what's listening on your machine:

    ss -tlnp 2>/dev/null
    
    What ports does your computer expose?

  4. Generate an SSH key (if you don't have one):

    ls ~/.ssh/                              # check first
    ssh-keygen -t ed25519                   # if no id_ed25519 exists
    cat ~/.ssh/id_ed25519.pub               # your public key
    
    Copy the public key - you'll need it for GitHub (page 15) and any servers.

  5. Add your key to GitHub: GitHub Settings → SSH and GPG keys → New SSH key → paste your public key. After: ssh -T git@github.com should respond with your username.

  6. Bonus - rsync one folder into another with --dry-run to see what would change:

    rsync -avh --dry-run --delete src/ dest/
    
    Useful before destructive syncs.

What you might wonder

"What's tmux for in this context?" SSH sessions die when your local connection drops. Run things inside tmux on the remote and they survive - reconnect with tmux attach. Indispensable for any remote work.

"What about nc (netcat)?" Low-level "make/accept TCP connections, send/receive bytes." Useful for testing services, transferring files when other tools aren't available. Niche but powerful.

"How do I serve a local directory over HTTP for quick sharing?"

python3 -m http.server 8000
Serves the current directory on port 8000. Open http://localhost:8000 in a browser. Great for sharing files on a LAN or testing.

"VPN, proxies, tunnels?" SSH itself can do port forwarding (ssh -L 8080:dest:80 user@host creates a tunnel). Beyond beginner; useful to know exists.

Done

  • Fetch URLs with curl (and maybe wget).
  • SSH to remote machines, with keys for passwordless login.
  • Copy files with scp and rsync.
  • See listening ports with ss.
  • Look up DNS with dig.

You've now covered the core CLI skills. Remaining pages: how to apply this to OSS contribution.

Next: Picking a project →
