Skip navigation

Tag Archives: ubuntu

So, Sunday, I got Rosetta Code moved from its old VPS to its new VPS. Three substantive changes:

  • The server is now accessible via IPv6. This was mostly brought about by configuration changes.
  • It’s now running on Debian Lenny, instead of Ubuntu 10.04.
  • The vast majority of the PHP code load has been updated to reflect newer versions of software.

I’m most pleased at the Squid cache’s performance. Out of 139,566 requests on Monday, 95,726 were TCP_MEM_HITs, 15,529 were TCP_MISSes, 10,036 were TCP_IMS_HITs, 7,419 were TCP_HITs, 6,889 were TCP_REFRESH_UNMODIFIED, 2,873 were TCP_CLIENT_REFRESH_MISS and 1,093 were TCP_REFRESH_MODIFIED. All in all, that means roughly 81% of client page requests never got past Squid to Apache+mod_php. 68% of those requests caught were satisfied by data Squid still had cached in RAM.

In short, I’m vastly underutilizing this server. I need to start looking at network throughput data and figure out if I can migrate to a smaller, cheaper VPS. Prgmr.com is planning doubling RAM, CPU and disk offerings at no cost increase, but I don’t know if they’re doubling network quotas  as well.

Advertisements

I mentioned I wanted to set this up at Casey DuBois’ place, modeled after my own setup, and Jeff Bosch asked me to pass on the instructions, so he could set it up over at The Geek Group. While Casey’s garage, my network and TGG are very different networks, a basic setup is pretty simple, and should work for a broad range of configurations. So here it is, blogged, for the entire GRLUG to play with.

First, I’m assuming you’re running Debian. I don’t think there’s any notable distinction between Debian’s packaging of Squid and, say, RHEL5, and the configuration directives are almost entirely compatible (AFAIK) between squid2 and squid3, so the only necessary significant difference between my setup and one for those packages is likely to be how to retrieve and install the package, and where the configuration files are kept. (Drop a note in the comments if you’ve got an answer for those.)

The first step, obviously, is to install the package itself. There are two Squid packages, squid and squid3. Squid 3 is a ground-up rewrite of Squid in C++. The language it’s written in isn’t really important, though. What’s important is that Squid 3 is where future development and improvements are going, and there’s at least one practical consequence of that: Between the two, the squid3 package is the only one that has support for IPv6. That’s the only practical difference between the two that I’m aware of.

In Debian, if you want to install the newer Squid 3, you run:

apt-get install squid3

If you want the pre-rewrite edition, you run:

apt-get install squid

Next, we need to configure Squid. If you’re using Squid 3 on Debian, the Squid configuration will be found in:

/etc/squid3/squid.conf

If you’re using the previous version of Squid, the configuration goes in:

/etc/squid/squid.conf

Probably the hardest part about setting up Squid is getting the Access Control Lists, or ACLs, right. You need to define your ACLs in the file before you try to say who is allowed access to what. (You don’t, for example, want to allow someone on the public internet to use your Squid installation to connect to arbitrary hosts on your internal network!)

Here’s the first ACL in my squid.conf file:

acl manager proto cache_object

Later, this will allow us to say who can and cannot explicitly manipulate cached objects.

My next ACL defines ‘localhost’ as a source–if it comes from these IP addresses, it’s from the machine Squid is running on. Here, I show both IPv4-only and IPv6-supporting versions. Use only one of them.

acl localhost src 127.0.0.1/32 ::1 # Use this if you’re using squid3.
acl localhost src 127.0.0.1/32 # Use this only if you’re using squid2, not squid3.

Next, we define ‘localhost’ as a destination. Again, two versions are shown.

acl to_localhost dst 127.0.0.0/8 0.0.0.0/32 ::1 # Use this if you’re using squid3.
acl to_localhost dst 127.0.0.0/8 0.0.0.0/32 # Use this only if you’re using squid2, not squid3.

Now we come to some details that are particular to your network. I got a little fancy and distinguished between IPv4 and two scopes of IPv6 networks. You’ll want to change ‘localnet4’ and ‘localnet8gl’ to reflect your network’s numbering.

acl localnet4 src 192.168.22.0/24 # our local IPv4 network.
acl localnet8ll src fe80::/10 # RFC 4291 link-local (directly plugged) machines
acl localnet8gl src 2001:470:c5b9::/48 # our local IPv6 network.

If you’re running Squid 2, you’ll want to skip the localnet8ll and localnet8gl lines. If you’re running squid3, but you don’t have a public IPv6 prefix, then you’ll want to skip localnet8gl. (BTW, getting public IPv6 space is pretty easy and free.)

Now, we use the ACLs we defined to control who has access to what. These directives are checked in order, and the first match wins.

http_access allow manager localhost
http_access deny manager

First, we allow the local system access to manage Squid, and then we close the ‘manager’ feature from anyone further.

http_access deny to_localhost

Next, we deny any request claiming to access ‘localhost’. This way, nobody should be able to access any services on your machine by asking your Squid service “hey, go grab http://localhost/some_sensitive/web/service”.

http_access allow localnet4
http_access allow localhost
http_access allow localnet8ll # Don’t use unless you’re at least running squid3.
http_access allow localnet8gl # Don’t use unless you’re at least running squid3, *and* you have a piece of the global IPv6 address space.

Next, we allow anyone on our local networks to use the squid proxy server to connect out. Note the comments.

http_access deny all

Wrapping up the access controls, we deny anyone left access to any service we haven’t already covered.

Here’s a quick dump of what’s left in my configuration, and some basic explanation of it:

http_port 8123

Configure Squid to listen on port 8123.

hierarchy_stoplist cgi-bin ?

Recommended Squid default; it’s probably in your existing squid.conf if you didn’t obliterate the whole file.

cache_mem 1024 MB

Configure Squid to use up 1024MB of RAM for caching. This is in addition to any additional overhead it may have. I have Squid running on a router with 2GB of RAM. Excessive, probably, but it works nicely for me.

cache_dir aufs /var/spool/squid3 81920 16 256

Configure Squid to use 80GB of disk for caching, and tell it where to put those cached objects.

maximum_object_size 5120 MB

I’d like Squid to at least be able to cache a full Linux liveDVD ISO image, so I configure it to cache objects up to around 5GB.

cache_store_log /var/log/squid3/store.log

Where Squid puts its log files. (This is a default value for squid3 on Debian. You probably don’t need to change it.)

coredump_dir /var/spool/squid3

If Squid crashes, this is where it will put the dump. (Another default for squid3 on Debian. You get the idea.)

refresh_pattern ^ftp: 1440 20% 10080
refresh_pattern ^gopher: 1440 0% 1440
refresh_pattern -i (/cgi-bin/|\?) 0 0% 0
refresh_pattern . 0 20% 4320

These control how long and likely Squid will consider a retrieved object ‘fresh’, meaning that it won’t go out and try to grab another copy of the file. These were default values on my installation.

quick_abort_min -1 QB

If I’ve aborted a program performing a download, I’d honestly like the proxy server to continue downloading the file. For me, it usually means I screwed up some parameter, and will be trying again shortly. Having the proxy server continue the download will mean much of the work will be already done by the time I try again. It may also mean that my system might continue downloading a DVD ISO image I’ve decided I don’t need, however, and cause my network to run slower. Your needs may vary from mine, so you may end up wanting to change this parameter.

read_ahead_gap 1 MB

How much data Squid should try to buffer on the client’s behalf. Tune this value if “buffer bloat” is relevant to you.

positive_dns_ttl 30s
negative_dns_ttl 1 second

Values controlling how long Squid should consider a DNS request fresh; this can significantly improve performance, especially if DNS is unreliable. (Though if it is, why aren’t you already running a local recursing DNS server as a cache?)

minimum_expiry_time 600 seconds
chunked_request_body_max_size 4096 KB

Some miscellaneous values tuning behavior of the HTTP protocol. You can probably leave these at their defaults.

With that configuration in place (and after you’ve restarted squid), your proxy server should be up and ready to run. Next up, you need to configure your client machines to use it. On Windows, follow KB135982, and tell non-IE applications to “Use system proxy settings.” On Linux, add these two lines to /etc/environment:

http_proxy=”http://hostname_or_ip_address_to_your_proxy_server:8123/”
ftp_proxy=”http://hostname_or_ip_address_to_your_proxy_server:8123/”

Replacing the obvious. If you used something other than 8123 (and you may have; Squid’s default is 3128, I’m just weird), then you should change the port specification here as well. After making those changes, restart your user session (or even your whole machine). There’s also WPAD, but I’ve never used that; you’re on your own there.