Showing posts with label Computers Channel. Show all posts

Tuesday, January 14, 2020

How TO **TROUBLESHOOT** A Network By Accident!





TP-Link (TL-SG1016D) Accidental Network Error Discovery!

The other day I saw a used GIGABIT switch for sale on Carousell, Hong Kong's local version of eBay + Amazon.  I bid on it and I won!  Yaay!

When the unit arrived, I did an unboxing video - but switches are pretty boring devices and there's not much material to do a "show and tell" with.  


So, at the suggestion of Mrs. Maker, I also did a "teardown" video where I took the switch apart in search of a moving part (like a fan) to check on, as I had with a 3COM switch I refurbished almost 10 years ago.

No dice.  The TL-SG1016D is a passively cooled device, which means that it has heat sinks and uses airflow and convection to keep itself cool.  Maybe in a future video I'll install a miniature fan, but first I would need to determine that such hackery is needed by using a thermal imaging system (like a FLIR camera).

Anyways, I did my best to bulk up the content by taking the switch apart, putting rubber feet on it and so on...and then ran out of stuff to say...

So I went and put it into service, fully expecting a no-drama situation - only to have a "WTF!" moment when I realized that one of the ports on the switch (PORT 4) was not reporting a Gigabit connection when it should have!




After a little bit of investigation, I found out that the cable in question was not working properly.  The fix was pretty simple.  I swapped out the ethernet cable for a different one.
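If the computer at the other end of the cable happens to run Linux, the negotiated speed can also be read from the host side instead of from the switch LEDs. Here is a minimal sketch, assuming an interface called eth0 (a placeholder, yours may well be named differently). The function name link_speed and the optional second argument are made up for illustration; the only real thing it relies on is that the Linux kernel exposes the negotiated speed, in Mb/s, in /sys/class/net/<interface>/speed.

```shell
#!/bin/sh
# Report the negotiated link speed of a network interface (Linux only).
# Usage: link_speed IFACE [SYSFS_ROOT]
# SYSFS_ROOT defaults to /sys; it is only a parameter so the function
# can be tried against a fake directory tree.
link_speed() {
    iface="$1"
    root="${2:-/sys}"
    speed_file="$root/class/net/$iface/speed"

    # The read may fail if the interface does not exist or the link is down.
    speed=$(cat "$speed_file" 2>/dev/null) || {
        echo "$iface: cannot read link speed" >&2
        return 1
    }

    # Some drivers may report -1 (or nothing) when the speed is unknown.
    case "$speed" in
        ''|-1)
            echo "$iface: link speed unknown (is the link up?)" >&2
            return 1
            ;;
    esac

    if [ "$speed" -ge 1000 ]; then
        echo "$iface: $speed Mb/s (Gigabit or better)"
    else
        echo "$iface: $speed Mb/s (below Gigabit - check the cable)"
    fi
}
```

Calling link_speed eth0 before and after swapping the cable should show the difference. The ethtool utility, where installed, reports the same value on its "Speed:" line.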

After that, the link came up at the correct speed:



So, in what had originally started off as a pretty banal video about a very simple device with no moving parts, no problems and practically zero user experience, I discovered a misbehaving part of my network that was negatively affecting Mrs. Maker's enjoyment of her computing experience.

Fixing this problem made me a hero to Mrs. Maker!   Mostly because I somehow magically improved the (perceived) speed of her computer...So...THANKS TP-LINK!


Help a Maker Out?


Did you see anything in this video you want for yourself? If the answer is yes, consider using one of the link(s) below to buy it directly from amazon.com. When you do, I get a small commission that keeps me going!

Bill of Materials


TP-Link 16-Port Gigabit Ethernet Unmanaged Switch | Plug and Play | Metal | Desktop/Rackmount | Fanless | Limited Lifetime (TL-SG1016D)








Tuesday, January 7, 2020

Dell M4700 REBUILD #5 - I Bought An **IMPOSTOR** Graphics Card!





The TaoBao Impostor Video Card


In my ongoing efforts to bring CRAZY, my Dell PRECISION M4700 laptop, back to life, I decided that swapping out the video card would make a certain amount of sense, if only at the right price.

Not knowing much about this type of Graphics Card (MXM TYPE A), I looked up the AMD GPU model (216-0834044) found on the Dell OEM MXM TYPE A Graphics Card (#3YF07), which carries the model number 109-C42251-00A.



This card identified in Windows as an AMD FirePro M4000 with 1GB of RAM.  So, I then performed a search for a card of that type on the source of cheapest stuff I know of, TaoBao.  For those of you who don't know what TaoBao is, it's kind of like a mashup of eBay and Amazon in China.  It's by far the largest eCommerce marketplace in China, and it is owned by Alibaba, the monster company started by Jack Ma in 1999.  Alibaba is the biggest eCommerce company in the world - bigger even than eBay or Amazon.

Anyway, looking around in TaoBao for an inexpensive MXM TYPE A Graphics Card, I came across an interesting listing.  The low price made buying it easy (¥76.00 / USD10.91), so I went ahead with the order!

Once the item arrived, everything looked OK.  The card came very well packaged, and the seller was thoughtful.  They even included a dab of thermal grease in a second bag, just in case I didn't have any on hand.

Even though it looked the same as the Dell OEM card - it had 4 x Hynix H5GQ1H24BFR and the same AMD GPU (#216-0834044) as the Dell Laptop Graphics Card I had previously taken a blowtorch to - this card turned out to be a little bit of a mystery.  The first thing I noticed was that it had these red-colored brackets around the GPU, and a different model number (# 109-C56351-00_02):



Eventually, using its model number, I was able to determine that this card was actually an OEM card produced for the HP Elitebook 8770W, and maybe only had 512MB onboard, which meant that while it looked for all the world like its Dell counterpart, in reality it had only 50% of the video RAM and was therefore (gently) an IMPOSTOR Graphics Card for my MXM TYPE A compatible Dell PRECISION M4700!


Well, we shall see if the Dell PRECISION M4700 likes, recognizes and configures the 8770W card, and is able to use it as it could its OEM Graphics Card, which was physically (but not technically) identical to this replacement. Installation was a snap, because the two cards share the same form factor.





Saturday, January 4, 2020

Dell M4700 REBUILD #4 - Enter The BLOWTORCH!




Why I Am Using A Blowtorch On A Video Card


In my continuing campaign to bring CRAZY, my Dell PRECISION M4700 laptop, back into the land of the living, I decided to try reflowing the GPU on its MXM TYPE A Daughtercard, an AMD FirePro M4000 (Model: 216-0834044).

This required me to (once again) disassemble the computer.  Once I got it stripped down to the point where I could remove the card, I did so and placed it on a wet cloth.

I then took a cigar torch and played the flame of the torch on the GPU, always being careful to move the flame around.  The chip got so hot that the solder underneath began to boil, so I removed heat and let the card stand (without moving it) for 10 minutes, at which point it was cool to the touch.


The whole point of this was to eliminate any thermal cracking of the solder balls that the AMD GPU is "floating" on, an arrangement called BGA (Ball Grid Array).  This has become more and more popular as devices get smaller and smaller, and reflow soldering techniques have become more precise and exact.

With many Very Large Scale Integrated (VLSI) chips, pins are now a thing of the past.  Instead, large chips are increasingly placed on Printed Circuit Boards (PCB) using solder balls that connect a "landing pad" array on the board and the chip, in what is called a Ball Grid Array (BGA) packaging type.  This is a perfectly good packaging type, and it comes with several advantages, the foremost of which is a very slim form factor.

Unfortunately, advances in solder have not kept pace.  Solder is much less flexible than it used to be, because of ongoing efforts in the electronics industry to respond to calls that it needs to reduce its toxic chemical content.  Lead, an extremely flexible (and ductile) element of solder, is highly toxic to humans, so it has been gradually phased out - leaving solder less flexible than it used to be.

The problem that emerges after a few years (especially in high-heat situations) is that the solder balls that form the BGA begin to develop micro cracks due to their constant expansion and contraction as they are heated up and cooled down.  Over time, the cracking can become so bad that flaky behaviour and intermittent failures begin to crop up as the temperature of the chip fluctuates, eventually leading to a complete failure.

See the video of how I attempted to reflow an AMD FirePro M4000 with a mini-blowtorch!







Bill of Materials


Micro Butane Torch Lighter,Kitchen Craft Cook's Blow Torch Professional Grade Culinary Blow torch for Cooking & Baking Camping Welding Flamethrower BBQ Outdoor https://amzn.to/39yGT97



Friday, January 3, 2020

Dell M4700 REBUILD #3 - (Thermal) Grease is the WORD!

Dell PRECISION M4700


Dell M4700 REBUILD #2 - (Thermal) Grease is the WORD!


In Episode 2 of the CRAZY Dell M4700 Saga, instead of putting the machine back together - I take it even more apart.

Why, you may ask?

Well, because the old heat sink compound (grease) on the GPU daughter card was so very crusty and burned.


Thermal Grease isn't supposed to look like a fried egg!


Considering this situation with respect to the GPU, it was pretty clear to me that the laptop probably needed its Thermal Grease freshened everywhere, including the CPU area.  Time to break out the screwdrivers, and dig even further into the guts of CRAZY!

What we are trying to do is:

1) Determine the probable conditions that caused CRAZY to fail

- My guess is CRAZY overheated because its heat dispersal system was blocked with lint

2) Pinpoint exactly which component or sub-system is in a failed state

- My guess is some part of CRAZY's graphics subsystem failed

3) Remediate, Repair and/or Replace the failed component or sub-system

- The first thing I did was to remove the lint from the heat dispersal system, which resolved the problem short-term

- What I am doing now is attempting a longer-term fix by digging even deeper into the heat dispersal system, to the point where the GPU makes contact with its heat pipe.

4) Prevent this (or a similar) component or sub-system from failing again in the same place and way, or a similar way

- While digging deeper into the GPU failure, a concern about a similar, heat-related failure with respect to the CPU has cropped up, so I am also replacing the Thermal Grease on the CPU out of an abundance of caution.

Seeing as one of the major enemies of laptops is heat, and since I already had the machine open, it made sense to me to also freshen its CPU Thermal Grease, given that the existing stuff had been in there, untouched, for almost a decade!




Bill of Materials

ARCTIC MX-4 - Thermal Compound Paste For Coolers | Heat Sink Paste | Composed of Carbon Micro-particles | Easy to Apply | High Durability - 4 Grams

Thursday, January 2, 2020

Dell M4700 REBUILD #2 - FAKE Dell Laptop Drive Not Found Error (w Explanation)

Dell PRECISION M4700




Dell M4700 REBUILD #2 - FAKE Dell Laptop Drive Not Found Error (w Explanation)


I came across a very interesting situation the other day when working on CRAZY, my Dell PRECISION M4700 laptop.

For those of you who do not know (or remember) what CRAZY is, it's a DELL PRECISION M4700 laptop that I bought out of the back of a car in a Real Canadian Super Store parking lot from a guy operating under an assumed name (no kidding!).

Although it worked OK at the time of purchase, within a month the machine started to exhibit some strange video behaviour, which was resolved by taking it apart and clearing its airways.  Once the airflow was cleared, the problem went away...mostly....

Turns out it wasn't 100% fixed.  It still exhibited problems, but so infrequently that the person using the laptop didn't treat it like a big deal and either waited for the video to settle down, or simply rebooted the machine, at which point the glitch vanished.

About a month ago, approximately a year and a half post-purchase, CRAZY froze one afternoon.  CTRL-ALT-DEL did nothing, so the machine had to be "cold" rebooted by pressing down the power button for about 10 seconds.

After that...nothing...the machine refused to do anything.  It didn't even display the DELL logo startup screen at which you can choose to press F2 for BIOS setup, or F12 for boot device selection.  Nada.  It just sat there with the power indicator on.

So I went into testing mode.  One of the things I tried was swapping the hard drive between that machine and a spare laptop I had purchased just in case of this very thing happening.

But it looked like the crash had also corrupted the hard disk in CRAZY (very badly!) because in the new machine the relocated drive WAS NOT EVEN RECOGNIZED.  There was no hope of booting the drive in the new machine, which caused a lot of extra work as applications needed to be re-installed.

There was also the issue of data loss.  There were some important files on that hard disk, so an extra effort had to be made to try to recover them.  So I went to the computer store and bought an external ATA case that I could connect to a working computer via USB.

This solution worked perfectly.  As in flawlessly.  Which made me start to wonder if there wasn't something else going on with respect to my previous troubles when it came to the DRIVE NOT FOUND error that I had encountered with the standby Dell PRECISION M4700. 

So I looked into the situation once again, without the anxiety and pressure of having just experienced a catastrophic hardware failure...and here's what I found:

It's actually possible now to mis-install a Dell SSD into the drive tray for the Dell PRECISION M4700 to the point where it falls under the internal SATA connector, at which point it looks like it has been installed, but in actual fact it is not!  Unlike with previous (thicker) magnetic media, it is possible to fully insert a Dell drive tray into a Dell laptop with a Dell SSD mis-installed to the point where it makes NO physical connection with the motherboard...and not know that you did it!

This video shows in detail how someone (like me) can fall victim to this problem, how to avoid it...and what to check.







Tuesday, December 31, 2019

Dell M4700 REBUILD #1 - My Laptop Is **DEAD** But Not For Long!

DELL PRECISION M4700



My **CRAZY** Dell Is DEAD - But Not For Long!





I bought CRAZY, a second-hand DELL PRECISION M4700, for CAD350 (USD267.51) in the early days of June 2018 out of the back of a car in the parking lot of an Ottawa Real Canadian Super Store...no kidding.


I was responding to an ad with the following headline:


Dell Precision M4700 Mobile Workstation Laptop w/ Firepro m4000

I found it on Kijiji, Canada's version of Craigslist.  

Here are the general specifications of the machine:


  • Dell Precision M4700
  • CORE i7 CPU (i7-3520 @ 2.9GHz)
  • Firepro m4000
  • 8GB RAM
  • 128GB SSD

So, I got in touch with the seller and planned a buy:



The person selling the computer, "Bob Walter", was obviously using an assumed identity, because he looked about as much like a "Bob Walter" as I look like a "Xi Jin Ping".

Anyway, despite its somewhat murky provenance, the DELL PRECISION M4700 seemed to work just fine when I checked it out.  The battery was OK.  It seemed to be in fine cosmetic shape.  It came with a 128GB SSD.  At the time of the purchase, the machine booted correctly and brought up a Windows desktop quickly and smoothly.  There was no obvious damage.  So, I was pretty pleased with the buy.

But, very shortly thereafter, even before the summer was over, it started to develop a very strange video problem that looked like this:


Taking a look at the machine, I noticed that the machine appeared to have a heat problem, because the right side of the keyboard was becoming very hot to the touch.  Too hot.

There are two fans on the DELL Precision M4700.  Looking at the screen, the one on the left cools the CPU, the one on the right cools the GPU.  Servicing the fans is pretty easy: you just flip the machine over, release the battery, unscrew a couple of baseplate screws, remove the baseplate and there you are.  The fans are in plain view (but reversed: the GPU fan is now on the left instead of being on the right, because the machine is upside down).


In the end, it turned out that the machine had a serious GPU cooling issue.  The GPU airflow was being blocked by what looked like a large clump of fluff and cat hair caught in (and hidden by) the exit port heat dispersal fins.  So, I removed the blockage and blew out the machine with compressed air.   The machine booted smoothly as always, and there was no more flickering video.  I thought I had resolved the issue...but I was wrong.

As it turns out, the primary user of the machine was still experiencing flickering video issues, just much more rarely.  When interviewed about this after the system failure, they said "When the video turned off completely, I would just wait a little while and it would come back on - it happened so rarely, it just didn't bother me and I didn't think about it any more".  

Hmmmm - that's not really an acceptable MAN THE MAKER situation...

One day, the machine simply froze.  Not thinking much of the situation, the primary user performed a hard restart and....nothing.  The machine powered up and then just sat there doing absolutely nothing.  No beeps!  No nothing!  Just a power indicator above the keyboard and nothing else.

The other thing that happened when this machine crashed is it somehow managed to scramble the contents of its SSD to the point where the disk became unrecognizable by any other computer.  I had to buy an external USB enclosure to get it recognized and the data off the disk and onto a replacement computer.

So, with the machine much deeper in trouble, back into troubleshooting mode I went.

I did the following to try to isolate the source of the problem:

1) I removed all of the memory, to check if there was a memory error and to elicit some beeps from the computer.  No beeps.

2) I removed the CMOS battery, thinking that there might be some weird CMOS setting with respect to the HDD preventing the machine from booting, but got nothing.

Because I was a bit short on time, I called a local laptop repair shop and asked them to take a look at the machine for me.

I don't normally do this - but I was about to travel overseas and it was imperative to get this machine back online as soon as possible.  

I told the technician everything I could about the machine (including the story of the blocked GPU fan) and said that I suspected that it was the graphics card that was faulty.  The technician called me later on the same day and said that a replacement of the graphics card would cost about USD75.00, labor included.  I told him to go ahead and get one, put it in, and tell me whether or not that fixed the problem.

I returned from my travels about a week later, only to be told by the technician at the laptop repair shop that he had been unable to repair the machine.  Furthermore, he wasn't sure what was wrong with it.  He told me that after he had swapped in a new graphics card, the machine still refused to boot.  

At that point, I asked him to box the machine up so I could come and get it.  The technician was kind enough to box the machine up quite nicely, and I went and got it without any issues - only to realize that he had forgotten to give me the bottom plate of the laptop, which bears the DELL SERVICE CODE I needed to get the "as built" specifications of the machine.

Once again short on time, I put it away - but promised myself that I'd take another look at it when I had a little more free time.  
Well, that time is now!

When I got the machine back I noticed something about it right away.  The thermal compound that had been applied to help conduct heat from the GPU to its heat sink was now the consistency of hardened drywall plaster!




In fact, the area where the GPU came into direct contact with the heat sink had baked the interfacing thermal grease A DARK BROWN.




Clearly, no replacement card had ever been installed.

The technician had told me a fib.  Who knows why...

This left me in a bit of a pickle, because I didn't know where the fault truly lay:


Potential Fault Origins:


A)  Was it the GPU board? 

- Was it fried permanently from having not been cooled correctly for years?
- Was it just overheating the instant the machine was powered up?


B)   Was it the CPU?

- If the GPU thermal paste was bad, surely the CPU paste was in the same state?
- Could the CPU be fried?

C)  Was it the motherboard?

- After all, the machine was 5 years old and suffering from thermal issues

Using Occam's Razor as my guide, I figured that the first component to mess with should be the GPU card, because the machine had a history of video problems, and those problems diminished (but did not disappear) when the cooling subsystem had been straightened out.

So, I figured that the simplest thing to do was to:


Re-assemble the computer 

- With the same Graphics Card
- With fresh thermal paste
- With clear airflow

How To Remove Thermal Paste (Use Rubbing Alcohol!)


Removing thermal paste is easy if you use rubbing alcohol and a bunch of Q-tips.  Simply apply the rubbing alcohol generously to the Q-Tip (I dip the Q-Tip directly into the bottle) and then rub the hardened thermal paste in a circular motion until it starts to dissolve.  

After a few minutes of this, an AMD chip face emerged:



Here are some specifications:

Part #:            216-0834044
Device Type:       Video Card
Manufacturer:      AMD
Product Line:      FirePro M Series
Model:             M4000
Description:       FirePro M4000 Chelsea XT GL 1GB Laptop DDR5 Graphics Card
API Supported:     DirectX 11, OpenCL 1.2
Enclosure Type:    Internal
Graphic Processor: AMD FirePro M 4000
Memory Size:       1 GB
Memory Technology: GDDR5 SDRAM
Memory Interface:  128-bit
Core Clock:        600 MHz
Memory Clock:      4500 MHz
Resolution:        2560 x 1600

Good to know.

After a few minutes of gentle rubbing (and about 10 Q-Tips), everything was clean again:



And then I cleaned the matching face on the Dell M4700 heat sink too:



This style of heatsink is called a Heat Pipe, because it is designed to use fancy physics to transfer heat being generated at Point A to a cooling strategy located at Point B.  

Heat Pipes are designed to conduct heat from one place to another using a highly heat-conductive material (in this case, copper) that transports the heat to another place where it can be dispersed - in this case, via airflow located some inches away, instead of focusing the airflow directly on the source of the heat itself.  Why?  Mostly because laptops need to be thin, and a vertical cooling strategy (like in a desktop computer) would be too thick.  Laptops need to cool horizontally, not vertically.  That's what a Heat Pipe makes possible.

Anyways, the sorry state of the thermal compound on the GPU made me suspect that the exact same thing had happened to the CPU, so a complete disassembly was probably in the cards once this machine had been brought back to life.


My first task was to get my hands on some fresh thermal compound!


I bought some thermal compound and re-applied, but that didn't work.

I bought a blowtorch and tried to reflow the BGA under the GPU, but that didn't work either.


So, I went ahead and bought another video card, because the one I found cost less than USD10.00:


Put in commentary about pressing "D" and the power switch to fire off the LCD test to prove that the backlight is working.

Put in commentary about pressing "<Fn>" and the power switch to fire off the DIAGNOSTICS test to  



Sunday, December 22, 2019

eBay Unboxing (DELL PRECISION M4700 Replacement Motherboard)

I received a much-anticipated box from eBay today - in it, a replacement motherboard for my Crazy DELL M4700 PRECISION laptop that decided to go POW! in the middle of an otherwise uneventful work day. Thankfully, I *always* have a spare laptop hanging around, and I keep all of my files on a RAID-5 enabled disk array (my NETGEAR ReadyNAS NV+, the subject of another video series) so I simply put the new laptop into place and kept on truckin'.

But the presence of a distressed laptop in the home is a nagging thing. If you are like me, this kind of thing lurks in the psyche and intrudes into one's thoughts at the most inopportune times. It simply must be somehow resolved, one way or another, and propelled toward a final state and related set of actions (fixed and put back into service, or liquidated for parts). Gear simply cannot linger in an unknown state in my home - and if you are watching this, probably not in yours either.

So I went online and bought a replacement motherboard on eBay to move this situation to a resolution, and this video shows the unboxing of that motherboard.




Saturday, December 21, 2019

I Used BRASSO To Make My Yellow Headlights Clear Again!

My wife, the reason for the existence of MY MAN THE MAKER, has been complaining about the headlights on our car for quite some time now. Our car is a 2006 Mitsubishi Outlander, and we like it a lot, but the headlights look terrible - they are yellow and frosty and generally unsightly. Mrs. Maker is also a bit worried that the headlights don't shine as brightly as they should because the lenses are so messed up.

So, after performing a little bit of research on the Internet, I hit upon using BRASSO as a means of cleaning up yellowed, crazed and otherwise sunburned headlight lenses. Figuring that I couldn't mess them up any worse than they already were, I bought a can of BRASSO for HKD38.00 (USD4.50) and set myself up to give the headlights a good polish. If you also want to try BRASSO on your headlight lenses, give us a hand and use this link: https://amzn.to/2sC4Ifq

This video documents me polishing the headlights to crystal clear using just BRASSO, a rag, some masking tape (optional), some scissors and a bottle of water. The results were fantastic. Mrs. MMTM and I are very pleased with the outcome of this little project and we plan to polish the headlights once a year.





named - How To Resolve DNS Server Abuse

CentOS 7 / named


How To Resolve DNS Server Abuse

One day, I noticed that my CentOS system log file (which is located at /var/log/messages) was filled with thousands of messages that looked like this:

16-Dec-2019 08:04:24.451 client: query (cache) '236.31.230.157.in-addr.arpa/PTR/IN' denied
16-Dec-2019 08:04:24.452 client: query (cache) '236.31.230.157.in-addr.arpa/PTR/IN' denied
16-Dec-2019 08:04:24.453 client: query (cache) '236.31.230.157.in-addr.arpa/PTR/IN' denied
16-Dec-2019 08:04:24.454 client: query (cache) '236.31.230.157.in-addr.arpa/PTR/IN' denied

After a little bit of research, I determined that this was happening because my Domain Name System (DNS) server process, named, was being abused.  Anonymous clients around the internet were using my DNS server to resolve names for any domain, not just the ones that I was responsible for.
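To get a feel for how bad the flood is, the denied queries can be tallied by the name being asked for. This is only a rough sketch: the function name summarize_denied is made up for illustration, and it assumes the log lines look exactly like the sample above, with the queried name as the first single-quoted field.

```shell
#!/bin/sh
# Count "denied" query lines per queried name, most frequent first.
# Usage: summarize_denied LOGFILE
# Assumes the format shown above, where the queried name is the first
# field wrapped in single quotes.
summarize_denied() {
    grep "denied" "$1" \
        | awk -F"'" '{ print $2 }' \
        | sort \
        | uniq -c \
        | sort -rn \
        | awk '{ print $1, $2 }'    # normalise the padding uniq -c adds
}
```

Running summarize_denied /var/log/messages prints one line per queried name, with the count first. A handful of names with very large counts is a good hint that the server is being hammered by automated traffic rather than by legitimate users.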

This is a holdover from when the Internet was a kinder and gentler place.  In the old days, a client could connect to any DNS server they wanted to resolve a name to an IP address, because name servers were complicated and rare and bandwidth was relatively scarce.  But, for about a decade now, Google has maintained two very high-quality and well-publicized DNS servers at 8.8.8.8 and 8.8.4.4, so there's no need for anyone to ask little old me to resolve a name beyond the domains for which I am responsible.

In this more cynical and hard-edged Internet age, open DNS servers like mine are being abused to power various nefarious schemes, including spam and denial of service attacks.

So I decided to do something about this situation, and use the fail2ban package to monitor DNS and block clients who made too many requests within a given period of time.



How To Set Up DNS for Abuse Prevention


The first system to make adjustments to is named, the DNS service offered by bind.

Clearing Up Some Naming Convention Confusion

Unfortunately, DNS is a bit of a confusing situation with a bunch of different names involved.  It's a long and old story, but the quick version is this:

DNS      - the name resolution standard defined in a document called a Request For Comments (RFC)
bind     - the software package that implemented the standard outlined in the RFC
named    - the name of the running service that offers the services outlined in the RFC


(re)Configuring named


The configuration file for named is located at /etc/named.conf.  It contains a lot of information, but for this exercise we are only going to look at a small number of settings:

1) Turn off named recursion

2) Segregate named logging to a dedicated logfile

How To Turn off named recursion


Recursion in named is the setting that enables anonymous clients to resolve any name using your server.  These days, this is not an advisable setting to leave turned on for a server that is reachable from the public Internet, and it is easy to disable:

Here's what my named.conf looked like when I was done:

  /*

   - If your recursive DNS server has a public IP address, you MUST enable access
     control to limit queries to your legitimate users. Failing to do so will
     cause your server to become part of large scale DNS amplification
     attacks. Implementing BCP38 within your network would greatly
     reduce this attack surface

  */

  #
  #  GL    2018-02-02    Uncomment ONE of the following
  #
  recursion no;
# recursion yes;
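Since the two recursion lines sit right next to each other, it is easy to end up with the wrong one commented out. Here is a quick sanity check, offered as a sketch only: recursion_disabled is a made-up helper name, and it does nothing smarter than look for an active "recursion no;" line in the file it is given.

```shell
#!/bin/sh
# Succeeds (exit status 0) if CONF contains an active "recursion no;" line.
# Usage: recursion_disabled CONF
# Lines that start with "#" or "//" do not match, because the pattern is
# anchored to the start of the line. Block comments (/* ... */) are NOT
# understood, so a "recursion no;" inside one would be a false positive.
recursion_disabled() {
    grep -Eq '^[[:space:]]*recursion[[:space:]]+no[[:space:]]*;' "$1"
}
```

For example, recursion_disabled /etc/named.conf && echo "recursion is off" gives a quick yes or no. The named-checkconf tool that comes with bind can be used to check the syntax of the whole file before restarting. Once named has been restarted, a query for some outside name made with dig against the server should carry the warning "recursion requested but not available".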

How To Segregate named Logging To A Dedicated Logfile

The next thing to do is configure named so it logs its messages to a segregated logfile, which makes system monitoring easier in some ways, because the logfile isn't flooded with named related messaging.

Thankfully, there are some instructions embedded in the fail2ban package that help out with this task.  Here they are:

# Fail2Ban filter file for named (bind9).
#

# This filter blocks attacks against named (bind9) however it requires special
# configuration on bind.
#
# By default, logging is off with bind9 installation.
#
# You will need something like this in your named.conf to provide proper logging.
#
# logging {
#     channel security_file {
#         file "/var/log/named/security.log" versions 3 size 30m;
#         severity dynamic;
#         print-time yes;
#     };
#     category security {
#         security_file;
#     };
# };

Let's look at the existing named logging situation:

logging {
        channel default_debug {
                file "data/named.run";
                severity dynamic;
        };
};

So, what's in the data/named.run file?

16-Dec-2019 08:04:24.451 client: query (cache) '236.31.230.157.in-addr.arpa/PTR/IN' denied
16-Dec-2019 08:04:24.452 client: query (cache) '236.31.230.157.in-addr.arpa/PTR/IN' denied
16-Dec-2019 08:04:24.453 client: query (cache) '236.31.230.157.in-addr.arpa/PTR/IN' denied
16-Dec-2019 08:04:24.454 client: query (cache) '236.31.230.157.in-addr.arpa/PTR/IN' denied

So it looks like this file also contains messages related to named.  Let's go ahead and implement the changes required by fail2ban.



Here's what my named.conf file looked like when I was done:

/* 

#  GL  2019-12-16  commented out as an experiment
#
logging {
        channel default_debug {
                file "data/named.run";
                severity dynamic;
        };
};
*/

logging {
  channel security_file {
    file "/var/log/named/security.log" versions 3 size 30m;
    severity dynamic;
    print-time yes;
  };
  category security {
    security_file;
  };
};

SELINUX / CentOS Peculiarities


Because CentOS ships with SELINUX enabled, it enforces more stringent security than most Linux distributions do out of the box, so a couple of other things need to be done to get this solution working.

Who is DNS?

One of the things that needs to be determined is the security context of the DNS server.  This is established in a couple of places, but the easiest way I know is to scan the /etc/passwd file, which contains a list of all of the users in the system, including services.

And indeed, this line appeared in /etc/passwd:

named:x:25:25:Named:/var/named:/sbin/nologin

Now we know that there is a named user context on this machine.  Next thing is to check the user context of the running service.  This can be done with the ps command:

# ps ax | grep "named"
 5907 ?        Ssl    1:12 /usr/sbin/named -u named -c /etc/named.conf

This output confirms that the username named is being used by the DNS service on this machine via the -u parameter (-u named).
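
The same check can be scripted instead of read by eye. The following is a minimal sketch; extract_run_user is a hypothetical helper name, and it assumes the account is passed with a separate -u argument as in the ps output above.

```shell
# Print the account name that follows "-u" in a command line read from stdin.
# extract_run_user is a hypothetical helper name.
extract_run_user() {
  awk '{ for (i = 1; i < NF; i++) if ($i == "-u") { print $(i + 1); exit } }'
}

# Usage on a live system (procps ps, as shipped with CentOS):
#   ps -o args= -C named | extract_run_user
```

If nothing is printed, either named is not running or it was started without -u.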

Directory Work

OK, with our detective work done, we can go ahead and do the following:

- Create a named subdirectory in the /var/log directory.
- Assign ownership of the named subdirectory to the user context of the DNS server process (named).
- Change the security context of the named subdirectory to that of a named log.
- Give other processes (like fail2ban) permission to enter that directory.

Here's how I did it:

# cd /var/log
# mkdir named
# chown named:named named
# chcon system_u:object_r:named_log_t:s0 named
# chmod 755 named

Here's the result of those actions:

drwxr-xr-x. named  named system_u:object_r:named_log_t:s0 named

Those commands created a directory called named, owned (and therefore manipulatable) by the user named (the user context of the DNS server), in the security context of a named log, which other processes like fail2ban may read and enter.


File Work

Now that the named subdirectory has been prepared, a file needs to be created in it for the named service to write to, and for fail2ban to scan to determine who to ban.

Here's how I did it:

# cd named
# touch security.log
# chown named:named security.log
# chcon system_u:object_r:named_log_t:s0 security.log
# chmod 644 security.log

Here's the result of those actions:

-rw-r--r--. named named system_u:object_r:named_log_t:s0 security.log

Those commands created a file called security.log, owned (and therefore manipulatable) by the user named (the user context of the DNS server), in the security context of a named log, which other processes like fail2ban may also read.
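
The directory and file steps can also be folded into one repeatable function. This is a sketch under stated assumptions, not a definitive script: the names run and prepare_named_log and the DRY_RUN switch are invented for illustration, while the owner, modes and SELinux context are the ones used above.

```shell
# Run a command, or just print it when DRY_RUN is set (hypothetical helper).
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Create the log directory and file with the ownership, SELinux context and
# modes used in this post. The body runs in a subshell, so its variables
# do not leak into the caller.
prepare_named_log() (
  dir="${1:-/var/log/named}"
  file="$dir/${2:-security.log}"
  ctx="system_u:object_r:named_log_t:s0"

  run mkdir -p "$dir" &&
    run chown named:named "$dir" &&
    run chcon "$ctx" "$dir" &&
    run chmod 755 "$dir" &&
    run touch "$file" &&
    run chown named:named "$file" &&
    run chcon "$ctx" "$file" &&
    run chmod 644 "$file"
)

# Usage (as root):
#   DRY_RUN=1 prepare_named_log    # print the eight commands first
#   prepare_named_log              # then run them for real
```

Running it with DRY_RUN=1 first is a cheap way to confirm the paths before anything is changed.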

How To (re)Start The named Service


With all of the preliminary work done, it's time to restart the named service and see if everything was set up properly.

Here's what you will see if there's a configuration problem:

# service named restart
Redirecting to /bin/systemctl restart named.service

Job for named.service failed because the control process exited with error code. See "systemctl status named.service" and "journalctl -xe" for details.

Here's what you will see if everything is OK:

# service named restart
Redirecting to /bin/systemctl restart named.service

#
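
One way to catch configuration typos before they take the service down is to validate named.conf first and restart only if it passes. The following is a minimal sketch; restart_named_checked is a hypothetical function name, while named-checkconf is the same tool the service unit runs in its ExecStartPre step.

```shell
# Validate the configuration, then restart named only if validation passes.
# restart_named_checked is a hypothetical helper name.
restart_named_checked() (
  conf="${1:-/etc/named.conf}"

  # named-checkconf prints the offending line and exits non-zero on a syntax error.
  if named-checkconf "$conf"; then
    systemctl restart named.service
  else
    echo "named.conf failed validation; not restarting" >&2
    exit 1
  fi
)

# Usage (as root):
#   restart_named_checked /etc/named.conf
```

Note that named-checkconf only checks syntax. A log file that named is not allowed to write to may still make the restart fail, in which case journalctl -xe is the place to look.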

How To Check The named Server Is Running


Checking to make sure that the named server is running is pretty easy with the service <service_name> status command:

# service named status
Redirecting to /bin/systemctl status named.service
● named.service - Berkeley Internet Name Domain (DNS)
   Loaded: loaded (/usr/lib/systemd/system/named.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2019-12-16 09:43:19 HKT; 5min ago
  Process: 10192 ExecStop=/bin/sh -c /usr/sbin/rndc stop > /dev/null 2>&1 || /bin/kill -TERM $MAINPID (code=exited, status=0/SUCCESS)
  Process: 10208 ExecStart=/usr/sbin/named -u named -c ${NAMEDCONF} $OPTIONS (code=exited, status=0/SUCCESS)
  Process: 10204 ExecStartPre=/bin/bash -c if [ ! "$DISABLE_ZONE_CHECKING" == "yes" ]; then /usr/sbin/named-checkconf -z "$NAMEDCONF"; else echo "Checking of zone files is disabled"; fi (code=exited, status=0/SUCCESS)
 Main PID: 10209 (named)
   CGroup: /system.slice/named.service
           └─10209 /usr/sbin/named -u named -c /etc/named.conf


How To Configure fail2ban To Prevent DNS Server Abuse


Now that we have named running correctly and its messaging related to DNS abusers segregated into a dedicated logfile (/var/log/named/security.log), we can configure fail2ban to scan that logfile and start banning DNS abusers!

The first place to visit is /etc/fail2ban/jail.local, which is where local fail2ban policies reside.  Here's what I found:

# !!! WARNING !!!
# Since UDP is a connection-less protocol, spoofing of IP and imitation of illegal
# actions is way too simple.  Thus enabling of this filter might provide an easy 
# way to implementing a DoS against a chosen victim. 
#
# See:  http://nion.modprobe.de/blog/archives/690-fail2ban-+-dns-fail.html

#
# Please DO NOT USE this jail unless you know what you are doing.
#
# IMPORTANT: see filter.d/named-refused for instructions to enable logging
#
# This jail blocks UDP traffic for DNS requests.
#
# [named-refused-udp]
# enabled  = true
# filter   = named-refused
# port     = domain,953
# protocol = udp
# logpath  = /var/log/named/security.log

# IMPORTANT: see filter.d/named-refused for instructions to enable logging
# This jail blocks TCP traffic for DNS requests.

#
# GL  2019-12-15  Enabled the following
#
[named-refused]
enabled  = true
port     = domain,953
logpath  = /var/log/named/security.log

#
# GL  2019-12-15  Enabled the following
#
[named-refused-udp]
enabled  = true

#
# GL  2019-12-15  Enabled the following
#
[named-refused-tcp]
enabled  = true
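
Before trusting the jail, it can help to dry-run the filter against the new log and then look at the jail itself. This is a sketch, assuming the stock filter lives at /etc/fail2ban/filter.d/named-refused.conf; check_named_jail is a hypothetical helper name.

```shell
# Dry-run the named-refused filter against the log, then show the jail status.
# check_named_jail is a hypothetical helper name.
check_named_jail() (
  log="${1:-/var/log/named/security.log}"
  filter="${2:-/etc/fail2ban/filter.d/named-refused.conf}"

  # fail2ban-regex reports how many log lines the filter's failregex matched.
  fail2ban-regex "$log" "$filter"

  # Show the failure count and banned addresses for the jail enabled above.
  fail2ban-client status named-refused
)

# Usage (as root):
#   check_named_jail
```

If fail2ban-regex reports zero matches on a log that clearly contains denied queries, the jail will not ban anyone, whatever jail.local says.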




How To Resolve fail2ban SELINUX Issues


With fail2ban running, setroubleshoot reported the following SELinux denial:

Dec 16 10:02:17 vm setroubleshoot: SELinux is preventing /usr/bin/python2.7 from read access on the file disable. For complete SELinux messages run: sealert -l 4f0b9ddb-62c0-4568-a19a-ada24825e993

# sealert -l 4f0b9ddb-62c0-4568-a19a-ada24825e993
SELinux is preventing /usr/bin/python2.7 from read access on the file disable.

*****  Plugin catchall (100. confidence) suggests   **************************

If you believe that python2.7 should be allowed read access on the disable file by default.
Then you should report this as a bug.
You can generate a local policy module to allow this access.
Do allow this access for now by executing:
# ausearch -c 'fail2ban-server' --raw | audit2allow -M my-fail2banserver
# semodule -i my-fail2banserver.pp

Additional Information:
Source Context                system_u:system_r:fail2ban_t:s0
Target Context                system_u:object_r:sysfs_t:s0
Target Objects                disable [ file ]
Source                        fail2ban-server
Source Path                   /usr/bin/python2.7
Port                          <Unknown>
Host                          vm.yougrow.net
Source RPM Packages           python-2.7.5-86.el7.x86_64
Target RPM Packages
Policy RPM                    selinux-policy-3.13.1-252.el7_7.6.noarch
Selinux Enabled               True
Policy Type                   targeted
Enforcing Mode                Enforcing
Host Name                     vm.yougrow.net
Platform                      Linux vm.yougrow.net 3.10.0-1062.9.1.el7.x86_64 #1
                              SMP Fri Dec 6 15:49:49 UTC 2019 x86_64 x86_64
Alert Count                   33
First Seen                    2019-12-15 10:24:17 HKT
Last Seen                     2019-12-16 10:02:14 HKT
Local ID                      4f0b9ddb-62c0-4568-a19a-ada24825e993

Raw Audit Messages
type=AVC msg=audit(1576461734.872:303365): avc:  denied  { read } for  pid=12096 comm="fail2ban-server" name="disable" dev="sysfs" ino=2085 scontext=system_u:system_r:fail2ban_t:s0 tcontext=system_u:object_r:sysfs_t:s0 tclass=file permissive=0

type=SYSCALL msg=audit(1576461734.872:303365): arch=x86_64 syscall=open success=no exit=EACCES a0=7fcb234bf110 a1=80000 a2=1b6 a3=24 items=0 ppid=1 pid=12096 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm=fail2ban-server exe=/usr/bin/python2.7 subj=system_u:system_r:fail2ban_t:s0 key=(null)


Hash: fail2ban-server,fail2ban_t,sysfs_t,file,read



Implementing the policies:

# ausearch -c 'fail2ban-server' --raw | audit2allow -M my-fail2banserver
******************** IMPORTANT ***********************
To make this policy package active, execute:

semodule -i my-fail2banserver.pp

# semodule -i my-fail2banserver.pp

#

Behind the scenes, this was going on in /var/log/messages:

Dec 16 10:04:57 vm systemd: Got automount request for /proc/sys/fs/binfmt_misc, triggered by 12436 (find)
Dec 16 10:04:57 vm systemd: Mounting Arbitrary Executable File Formats File System...
Dec 16 10:04:57 vm systemd: Mounted Arbitrary Executable File Formats File System.
Dec 16 10:37:54 vm kernel: SELinux:  Converting 2406 SID table entries...

Dec 16 10:37:57 vm dbus[953]: [system] Reloaded configuration


Checking fail2ban again after installing the policy module:

This DID NOT resolve the issue.
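
When a generated policy module does not help, a useful next step is to see exactly which denials are being logged, since audit2allow can only write rules for denials that have already occurred. Whether that is what happened here is not confirmed. The following is a minimal sketch that condenses raw AVC records into one line per distinct denial; summarize_avc is a hypothetical helper name.

```shell
# Condense raw AVC records (stdin) into "count source -> target:class { perms }".
# summarize_avc is a hypothetical helper name.
summarize_avc() {
  sed -n -E 's/.*denied  *\{ ([^}]*) \}.* scontext=[^:]*:[^:]*:([^:]*):.* tcontext=[^:]*:[^:]*:([^:]*):.* tclass=([^ ]*).*/\2 -> \3:\4 { \1 }/p' |
    sort | uniq -c
}

# Usage (as root), limited to denials logged against fail2ban-server:
#   ausearch -m avc -c fail2ban-server --raw | summarize_avc
```

For the audit record shown earlier, this reports fail2ban_t -> sysfs_t:file { read }. Any other line in the summary is a denial that a module built only from that record would not cover.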



Rotating The New named Security Log




How To List The Contents Of fail2ban Jails


#!/bin/bash

###########################################################################################
#
# fail2ban-jails.bash
#
# This program lists all of the active jails in the FAIL2BAN system.
#
# GL    2019-12-04      fail2ban-jails.bash
#
###########################################################################################

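# Pull the jail names out of the "Jail list:" line and drop the commas.
JAILS=$(fail2ban-client status | grep "Jail list" | sed -E 's/^[^:]+:[[:space:]]+//; s/,//g')

# $JAILS is deliberately unquoted so that it splits into one word per jail.
for JAIL in $JAILS
do
  fail2ban-client status "$JAIL"
done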

REFERENCES