Squid Cache

HAVP Teamwork with Other Proxies (e.g. Squid)

HAVP is Parent Proxy:

Use HAVP with Squid to benefit from Squid's ACLs (for example, to scan only certain file types such as .exe or .bat). You can use HAVP as a parent cache for Squid so that cached requests don't have to be scanned again. But this does not improve performance in all cases: it is only faster when Squid already has the file in its cache. Otherwise performance can drop, because the file has to pass through both HAVP and Squid, so you should use a fast CPU if you run both. The disadvantage is that cached content is no longer scanned by HAVP, so an infected file that made it into the cache will keep being served.

You can create ACLs in Squid to control which traffic is passed on to HAVP for scanning (see the sketch after the basic configuration below). Please refer to the Squid homepage to learn about the other features Squid offers.

This is a simple configuration that uses HAVP as parent cache. All traffic will be scanned by HAVP.

/etc/squid/squid.conf

acl all src 0.0.0.0/0.0.0.0

cache_peer 127.0.0.1 parent 8000 0 no-query no-digest no-netdb-exchange default

cache_peer_access 127.0.0.1 allow all

#Only http traffic can be scanned 
acl Scan_HTTP proto HTTP
never_direct allow Scan_HTTP
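
If you only want certain file types to go through HAVP, as mentioned above, a Squid ACL can steer just those requests to the parent instead of the catch-all Scan_HTTP rule. This is only a sketch on top of the configuration above; the ACL name scan_types and the extension list are examples, not taken from the original setup:

acl scan_types urlpath_regex -i \.(exe|bat|com|zip|rar)$
never_direct allow scan_types
always_direct allow !scan_types

Requests matching scan_types are forced through the HAVP parent; everything else goes directly to the origin server and is therefore not scanned.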

Squid is parent Proxy:

You can also use Squid as parent proxy for HAVP. Please check out default.h and change:

//Parent Proxy (Name)
#define PARENTPROXY "192.168.1.1"
#define PARENTPORT 3128
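
Newer HAVP versions can also take these settings from havp.config at runtime instead of requiring a recompile. Assuming your version ships that file, the equivalent entries would look roughly like this (same values as above):

PARENTPROXY 192.168.1.1
PARENTPORT 3128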

HAVP between two Squid instances (sandwich):

I got a mail from Claer discussing HAVP between two Squid instances:

> -----Original Message-----
> From: Claer [mailto:claer@rax.homeunix.com] 
> Sent: Tuesday, August 23, 2005 4:51 PM
> To: christian@hilgers.ag
> Subject: HAVP and Squid
> 
> Hello,
> 
> I used to install Antivirus scanners (namely Trendmicro ISVW) with
> Squid.
> 
> When I saw the two solutions you showed for installing HAVP with Squid,
> it reminded me of the same reflection I went through some time ago.
> 
> Here is the solution I found to be the best:
> 
> Users -> Squid -> AV -> Squid -> Web Server
> 
> Explanations:
> – If you build the configuration with Users -> AV -> Squid -> Web
> Server, you will have problems creating statistics based on Squid log
> analysis. As you spotted, ACLs based on IP cannot be used anymore. It
> also means that content filters such as SquidGuard become unusable.
> 
> – If you build the configuration with Users -> Squid -> AV -> Web
> Server, you will have two problems.
> * First, you will have a load problem. With this configuration, you
> will have to maintain all HTTP timeouts yourself; you cannot rely on
> the system TCP timeout. (I hit this problem on Solaris, where the
> default TCP timeout is 7200 seconds. Way too much for web traffic.)
> Just check the Squid timeout options, it's surprising how many options
> there are in this area.

HAVP has its own timeouts, so you don't need to rely on the TCP timeouts, but I guess Squid is more comfortable in this respect.

> * Then, you'll have the cache problem. If a user downloads a file with
> a virus that is not yet detected by the current definitions, the
> virus gets cached by Squid. When the detection rules get updated, a
> later request will fetch the virus from the cache without it being
> checked against the current antivirus definitions.
> 
> For all these reasons, I configure Squid with two listening ports: one
> port for users, one port for antivirus requests.
> 
> An ACL checks that requests on the first port are authenticated and NOT
> cached. The HTTP request is then forwarded to the AV as parent.
> 
> An ACL checks that requests on the second port are NOT authenticated,
> come from the AV machine (often 127.0.0.1), and are cached.
> 
> With this configuration, I got another problem 🙂 Squid detects that
> the same URL passes through its process twice. The awful solution I
> found was to patch Squid so it does not log the URL forwarding-loop
> warnings. This way, I can use the safest solution I found for coupling
> an AV with Squid. The Squid log files are analysable, modulo a filter
> on the antivirus IP; as the analyser we use has this option, it's not
> a problem.
> 
> If you are interested, I can give you a sample squid.conf and the line
> I commented out in the Squid sources. I'm sure you will find a better
> way of doing this. All I can do with C code is comment out the
> problematic line 😉
> 
> Best regards,
> 
> Claer

Passing the traffic through Squid twice costs more CPU, but in most cases this is acceptable.

Here are the patch and the Squid config file:

sample-squid.conf
squid-2.5.5-01-claer.patch
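
For reference, a minimal sketch of the two-port setup described in the mail could look like the following. This is only an illustration under a few assumptions (users connect to port 3128, HAVP listens on port 8000 and sends its requests back to Squid on port 3129; authentication is left out), not the contents of the sample-squid.conf above:

http_port 3128
http_port 3129
icp_port 0

cache_peer 127.0.0.1 parent 8000 0 no-query no-digest no-netdb-exchange default

acl from_users myport 3128
acl from_av myport 3129
acl av_host src 127.0.0.1/32

# Requests from users: never cached, always forced through the HAVP parent.
no_cache deny from_users
never_direct allow from_users

# Requests coming back from HAVP: fetched directly and allowed to be cached.
always_direct allow from_av
http_access allow from_av av_host

HAVP itself would then be configured to listen on port 8000 and to use 127.0.0.1 port 3129 as its parent proxy.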

 


I got an email from Dirk Nehring. He told me that the Squid patch is not needed.
This is his configuration:

##################################################
#
# /etc/squid/squid.conf
#
# Sandwich config for HAVP
#
# All options to configure are marked with "XXX"

#
# Header of squid.conf
#
# XXX visible_hostname proxy.domain.com

http_port 3128
icp_port 0

# scanning through HAVP
cache_peer localhost parent 8080 0 no-query no-digest no-netdb-exchange default

# Memory usage values
cache_mem 64 MB
maximum_object_size 65536 KB
memory_pools off

# 4 GB store on disk
cache_dir aufs /var/spool/squid 4096 16 256

# no store log
cache_store_log none

# Passive FTP off
ftp_passive off

# no X-Forwarded-For header
forwarded_for off

# Speed up logging
buffered_logs on

# no logfile entry stripping
strip_query_terms off

# Speed, speed, speed
pipeline_prefetch on
half_closed_clients off
shutdown_lifetime 1 second

# don’t query neighbour at all
hierarchy_stoplist cgi-bin ?

# And now: define caching parameters
refresh_pattern ^ftp: 20160 50% 43200
refresh_pattern -i \.(jpe?g|gif|png|ico)$ 43200 100% 43200
refresh_pattern -i \.(zip|rar|arj|cab|exe)$ 43200 100% 43200
refresh_pattern windowsupdate.com/.*\.(cab|exe)$ 43200 100% 43200
refresh_pattern download.microsoft.com/.*\.(cab|exe)$ 43200 100% 43200
refresh_pattern -i \.(cgi|asp|php|fcgi)$ 0 20% 60
refresh_pattern . 20160 50% 43200

#
# Access ACLs
#
acl manager proto cache_object
acl all src 0.0.0.0/0.0.0.0
acl localhost src 127.0.0.1/32

# XXX local networks
acl localnet src 10.0.0.0/8
acl localnet src 172.16.0.0/12
acl localnet src 192.168.0.0/16

acl SSL_ports port 443
acl Safe_ports port 80 # http
acl Safe_ports port 21 # ftp
acl Safe_ports port 443 # https
acl Safe_ports port 1025-65535 # unregistered ports
acl CONNECT method CONNECT
acl QUERY urlpath_regex cgi-bin \?
acl HTTP proto HTTP

# Do not scan the following file types or domains
acl noscan urlpath_regex -i \.(jpe?g|gif|png|ico)$
# XXX acl noscan dstdomain proxy.domain.com

# We do not want traffic to these sites:
# XXX acl evil dstdomain www.veryevildomain.dom

#
# Applying ACLs
#
http_access deny manager
http_access allow localhost
http_access deny !localnet
http_access deny !Safe_ports
http_access deny CONNECT !SSL_ports
#http_access deny evil

# For the sandwich configuration we have to disable the "Via" header,
# otherwise we get a "forwarding loop".
header_access Via deny all

# Do not cache requests from localhost, SSL-encrypted or dynamic content.
no_cache deny QUERY
no_cache deny localhost
no_cache deny CONNECT
no_cache allow all

# Do not send requests from localhost (loop prevention), requests for
# "noscan" destinations, or SSL-encrypted requests to the parent.
always_direct allow localhost
always_direct allow CONNECT
always_direct allow noscan
always_direct deny HTTP

never_direct deny localhost
never_direct deny CONNECT
never_direct deny noscan
never_direct allow HTTP
----------------------------------------------------------------------

Don't forget (disable the Via header to avoid a forwarding loop):

header_access Via deny all
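
On the HAVP side of this sandwich, HAVP has to send its requests back to Squid. With the compile-time defines shown earlier, that would look roughly as follows; note that the PORT define is an assumption about how the listening port is named, so check your default.h (or havp.config) for the exact name:

//HAVP listening port (must match the cache_peer line above)
#define PORT 8080
//Parent Proxy (Name)
#define PARENTPROXY "127.0.0.1"
#define PARENTPORT 3128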

* Mandatory locks: use a loop device instead of a ramdisk:

%post
umask 022
grep -q havp /etc/fstab
if [ $? = 1 ]; then
cat <<EOF >>/etc/fstab
/var/havp.img /var/tmp/havp ext2 loop,mand 0 0
EOF
mkdir -m 750 /var/tmp/havp
chown %{havp_user}:%{havp_group} /var/tmp/havp
dd if=/dev/zero of=/var/havp.img bs=10240 count=512
mke2fs -F -q -m0 /var/havp.img
mount /var/tmp/havp
chmod 750 /var/tmp/havp
chown %{havp_user}:%{havp_group} /var/tmp/havp
fi
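
To check afterwards that the image is really mounted with mandatory locking enabled, something along these lines should work (just a quick sanity check, not part of the original %post script):

mount | grep /var/tmp/havp
# should print something like:
# /var/havp.img on /var/tmp/havp type ext2 (rw,loop=/dev/loop0,mand)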