Selfbuilt's Flashlight Reviews

Extensive comparative analyses of modern LED flashlights

My goal is to provide objective testing results, in a readable form, to let you decide what flashlight best fits your needs. My reviews are posted on candlepowerforums, to facilitate ongoing discussions with the user community.

My flashlights are expensive to feed with all the runtime tests I perform, so I gratefully accept donations to my Paypal battery fund.

Flashlight Resources

See my Testing Methods page for further resources.


Video overviews of each of my flashlights, plus additional background primers.

ANSI FL-1 Standard

Overview of the ANSI/NEMA standard for flashlight testing (FL-1).

Lumen Estimation Method

My lightbox design, and how I have calibrated it for estimated lumens.

Outdoor Beamshots

Overview of all my recent outdoor 100-yard beamshots.

[Sample outdoor beamshot: dog by a stream]

Flashlight Testing Methodology

My review thread structure has evolved over time, but my testing methodology has remained largely consistent. This allows you to directly compare output and runtime numbers from one review to another (i.e. an output level of "50" in one review is the same as a "50" in any other). I will explain the testing methodology below, and on the linked pages.

First off, I owe a great deal to the methodology developed by Doug Pribis on his outstanding flashlightreviews site. Sadly, this excellent resource is no longer active - the domain name was sold, and the site has since been updated with unconfirmed material, including some of the original flashlightreviews source material alongside newer text that appears to be taken from other review sites. But at least the background info still seems to be there for the time being. Note that I am not affiliated with the current or previous owners of that site in any way - I just respect and admire Doug's work, and have tried to carry on the tradition in my own way, using a similar methodology.

Flashlight testing standards

The first question that comes up is what do all the numbers in my reviews actually mean. Generally, I report on the various characteristics of flashlights using the ANSI/NEMA FL-1 standard. For more information on my specific testing and reporting procedures, please see:

Selfbuilt's ANSI/NEMA FL-1 Standards page

Most of the specific details about how I do my beamshots, runtimes, etc. are included in my actual reviews. There is also a flashlightwiki site that has excellent background information on flashlights.

I am not affiliated with the above site, and information could change.

Starting in August 2012, I am now using a NIST-calibrated lux lightmeter for all beam intensity and distance measures (i.e., "throw" values) in my reviews. For a discussion of how this has affected the presentation of my results, please see:

Revised Selfbuilt beam intensity measures: new NIST-calibrated Extech EA31 lux-meter

Lightbox design and output measures

For a good background explanation of the difference between lux, lumens, and other output measures, please see the following pages on the flashlightreviews site:

Lux Measurements
Overall Output and Throw of Lights Tested

For a good background explanation of the lightbox apparatus, please see the following page on the flashlightreviews site:

The Overall Output Measurement Experiment

There are a few differences in my case - for example, my lightbox design is permanently mounted (i.e., the sensor never moves). I also maintain an ongoing calibration of the lightbox, using a standardized set of lights tested at regular intervals. Since inception, I have confirmed less than 2% drift per year, which I correct for on at least a monthly basis. So the relative output values you see today are entirely consistent with the results from several years ago.
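To make the drift-correction idea concrete, here is a hypothetical sketch (the function names and numbers are mine for illustration, not Selfbuilt's actual procedure) of rescaling readings against a standard light:

```python
# Hypothetical sketch of drift correction against a standard light.
# Numbers are illustrative only; actual lightbox units are relative.

def correction_factor(reference_reading, current_reading):
    """Ratio that rescales today's lightbox readings back to the
    original calibration, using a standard light's known value."""
    return reference_reading / current_reading

def corrected(raw_reading, factor):
    """Apply the calibration correction to a raw lightbox reading."""
    return raw_reading * factor

# Example: a standard light that originally read 1000 (relative units)
# now reads 985 -- about 1.5% downward drift, within the <2%/year range.
factor = correction_factor(1000.0, 985.0)
print(round(corrected(985.0, factor), 1))  # the standard light maps back to 1000.0
```

Applying the same factor to every reading keeps all lights on the original relative scale.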

The point is that the internal standard for my lightbox (and all runtime graphs) is tightly controlled and monitored so that results are directly comparable. If I were to run those lights in my lightbox again (with the current calibration), the graphs would be pretty much indistinguishable (I know, because I do this periodically to confirm the calibration).

As to the actual lumen estimate, that is a different matter - I have never made any claim to its absolute accuracy. The method I use to adjust my internally-consistent and calibrated lightbox values to estimated lumens (and note that I always refer to them as such) is based on a statistical relationship developed from an extensive series of comparisons, described in detail here:

Selfbuilt's Lightbox-Lumen Conversion page

The point is that the relative accuracy of my measures remains remarkably high. So, for example, if I estimate one light at 270 lumens and another at 300 lumens, you can be fairly comfortable concluding that the second light is indeed about 10% brighter. But whether that is really 240 and 265 lumens (or 300 and 330 lumens, etc.) I cannot say with any certainty. For that, I am relying on the results of the 150 or so comparison points in the analysis above.
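The relative-accuracy point comes down to a trivial calculation - the percentage difference between two estimates survives any rescaling of the absolute lumen scale (a toy sketch, not part of the actual analysis):

```python
# Toy illustration of relative vs. absolute accuracy: the *ratio* between
# two estimates is meaningful even if the absolute scale is uncertain.

def relative_difference(lumens_a, lumens_b):
    """Percent by which light B is brighter than light A."""
    return (lumens_b - lumens_a) / lumens_a * 100

# The 270 vs. 300 lumen example from the text: ~11% brighter...
print(round(relative_difference(270, 300), 1))  # 11.1
# ...and scaling both estimates by the same factor changes nothing:
print(round(relative_difference(270 * 0.9, 300 * 0.9), 1))  # 11.1
```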

Frankly, no one without a properly maintained, properly-sized, NIST-certified, calibrated integrating sphere - used under controlled conditions by a knowledgeable and skilled operator - can assert truly accurate absolute lumens. However, as you will see in the analysis above, I think I have gone to more effort than most in trying to make my estimates as good as they can be.

In any case, I still make no claim to absolute lumen accuracy. But the runtime graphs remain a well-calibrated and internally-consistent relative set of results from my testing, using only new batteries verified for consistent relative performance.

Runtimes explained

For a good background on how runtimes are done, please see the following page on the flashlightreviews site:

Runtime Graphs Explained

I have also prepared a video on how to read my runtime graphs on my channel page.

In my own case, I didn't initially set out to do full product reviews - my original goal was simply comparative output/runtime assessments. It is an understandable (and often very valuable) goal to reduce complex systems down to just a handful of summary variables to facilitate comparisons. In research, we do this routinely for normally-distributed data, where the average and a consistent measure of variation (e.g., the standard deviation) give you enough information to meaningfully compare different groups. For non-linear datasets, researchers historically looked for ways to "linearize" the data (e.g., logarithmic transformations), as this again reduces the comparison to just two variables (i.e., slope and y-intercept).
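As a toy example of that linearization idea (the constants here are made up purely for illustration), an exponential decay becomes a straight line after a log transform, reducing the comparison to a slope and an intercept:

```python
import math

# Hypothetical exponential decay y = a * exp(b * x); a and b are
# arbitrary constants chosen only to illustrate the transform.
a, b = 5.0, -0.3
xs = [0, 1, 2, 3, 4]
ys = [a * math.exp(b * x) for x in xs]

# After a log transform the curve is a straight line:
#   log(y) = log(a) + b * x, so slope = b and intercept = log(a).
log_ys = [math.log(y) for y in ys]
slope = log_ys[1] - log_ys[0]  # x steps by 1, so this is the slope
intercept = log_ys[0]
print(round(slope, 3), round(intercept, 3))  # -0.3 and log(5) ≈ 1.609
```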

But a fundamental precept here is that it is important to consider ALL relevant information - you can't just ignore critical variables. You can think of it in terms that all data is composed of pattern plus noise. The goal of analysis is to only remove the noise, and to make sure you accurately capture the pattern. It is a huge mistake to ignore the parts of the pattern.

What does this have to do with flashlights? It's much the same for flashlight comparisons - many people would like to reduce flashlights to just a couple of numbers (e.g., output and runtime). But doing this does NOT really give you everything you need to know, given the huge variability in how circuits are programmed. To illustrate, below is a graph of a number of theoretical flashlight patterns that all have the same ANSI FL-1 output and runtime values:

Clearly, these lights above are NOT all the same. The purple line is like some simple incandescent lights that lack a circuit (i.e., rapid drop-off, followed by slow stabilization at the low outputs). The blue line is an example of "direct-drive" with Li-ion cells in LED lights (i.e., the internal resistance of the battery is controlling the decay rate). The yellow line is what you might expect from a perfectly flat-regulated circuit (although this is quite rare). Nowadays, the red line is probably closer to what you could expect from many regulated LED lights, especially on max (i.e., there is a time-delayed drop-off in output after a certain point). But partially regulated lights could also produce other patterns, such as the gray and green lines.

This is why I got into doing output/runtime graphs. The human brain is particularly good at recognizing and comparing patterns visually (although we are also easily fooled - that's a topic for another discussion). The runtime graphs allow you to see at a glance how different lights really compare to each other.

By the way, if you are interested in a more detailed explanation of the statistical reasoning behind visually inspecting your data, you might enjoy reading about Anscombe's Quartet.
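Anscombe's Quartet can be checked in a few lines - using the quartet's published values, a minimal sketch showing that four very differently-shaped datasets share essentially the same summary statistics:

```python
# Anscombe's Quartet (public data): four datasets with nearly identical
# summary statistics but completely different shapes -- the statistical
# argument for eyeballing graphs rather than trusting two numbers.
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient, computed directly."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
quartet = [
    (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    (x4,   [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
]

for x, y in quartet:
    # every set: mean(x) = 9.0, mean(y) ~ 7.50, correlation ~ 0.816
    print(round(mean(x), 2), round(mean(y), 2), round(pearson(x, y), 3))
```

Plot the four sets, though, and one is a line, one a curve, one a line with an outlier, and one a vertical cluster - only the graphs reveal it.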

Why do you use a cooling fan for runtimes?

One thing that is critical to keep in mind when comparing runtimes results is whether or not active cooling is being used by the reviewer.

All my runtimes are done under a small cooling fan, for consistency and safety reasons. Even in a climate-controlled environment, I can tell you that the ambient temperature in my office in the morning and evening can be quite different - and quite different from one season to the next. For this reason, I use active fan cooling to provide a more standardized testing environment for the lights. This is important to allow you to accurately compare results. As I do a lot of runtimes - including unsupervised ones overnight (which I don't recommend, especially for Lithium or Li-ion batteries) - I also prefer the safety of knowing the lights are being at least somewhat cooled.

Many will claim that this is not representative of "actual use" of the light, and there is of course merit to that point. But by the same token, doing runtimes with NO cooling is not representative either. In real life, you will typically be carrying the light around, where there will be some active cooling from your hand (i.e., your own circulatory system works to transfer heat away from the light, through the interface of your skin). This is why picking up a light that has been running in isolation for some time (e.g., tailstanding on the ground) can be quite an unpleasant experience with a bare hand - even though it would never have gotten that hot if you had been holding it the whole time.

Also, in "actual use", you may very well be outdoors, where there will typically be movement of air over the light from your own movements, or from the relative climatic conditions (i.e. a windy evening). An isolated high-powered light running inside a lightbox in the corner of a variable ambient temperature office is hardly the same as those "real world" conditions either.

And of course, it gets more complicated than that - the "real world" is highly variable! As I write this, it is late December in Canada. At night, it is generally well below 0 degrees Celsius, and rather windy. That is certainly quite different from the frequent >30 degree Celsius August evenings around here (which can often be quite still, with humidex values reaching much higher). Again, the point of comparative flashlight testing is not to match every possible environment, but to provide as standardized an environment as possible, one that falls within a normal range.

Another point to keep in mind is that few people will turn on a fully-charged light and let it run to battery exhaustion in one sitting, as we reviewers do. In most cases, you will be turning the light on and off for short bursts of time. For lights that don't have a timed step-down feature (i.e., ones that have no step-down, or use a thermal step-down control), the fan-cooled runtime data is probably more relevant to helping you gauge overall battery life. For example, I keep a light by the back door that I know runs constantly at a regulated max level for just over 2 hours with a cooling fan (i.e., from continuous runtime testing). I use this light for taking the dog outside at night before bed, and typically spend no more than 2 mins on max each time (i.e., it doesn't have time to heat up and trigger its thermal step-down). As a result, I know I can go for at least 2 months before having to recharge the cell, because this usage pattern more closely matches my runtime testing paradigm.
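The back-door light example above amounts to simple arithmetic - a quick sketch using the numbers from the text:

```python
# Usage estimate from the text: a light with just over 2 hours of
# continuous regulated runtime (fan-cooled), used ~2 minutes per night.

continuous_runtime_min = 120  # just over 2 hours, from runtime testing
minutes_per_use = 2           # typical max-output burst per outing

uses_per_charge = continuous_runtime_min // minutes_per_use
print(uses_per_charge)  # 60 nightly outings, i.e. about 2 months of daily use
```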

It is worth considering whether the actively-cooled, continuous-runtime testing results are likely indicative of your actual usage patterns. You just can't expect that any one standardized testing method is going to be directly generalizable to every possible "real world" scenario.

What kind of batteries do you use in your testing?

I go through a LOT of batteries in my testing. And not just primary cells - I go through a lot of rechargeables as well. In fact, I probably spend more on rechargeables than primary cells in any given year.

If you are looking for specific recommendations of flashlights broken down by battery type, please see my Flashlight Recommendations page.

My Li-ion cells are standard protected ICR chemistry (LiCoO2), manufactured by AW. Fortunately for me, AW still manufactures and sells his original 2200mAh 18650 cells, 750mAh RCR, and 750mAh 14500 - in a consistent fashion. To take the example of the 18650, I buy over a dozen new ones every year, usually in batches of 4 every 3-4 months. I go through somewhat fewer RCR (and even fewer 14500). Typically, I discard cells after about 40-50 discharge cycles at most (often less).

I verify every new battery that arrives to confirm that it performs within the range of earlier samples (using a standardized set of lights that I run everything through in my lightbox). Capacity outliers are discarded, although that is pretty rare with AW. His 2200mAh 18650 and 750mAh RCR protected cells have been remarkably consistent in rated capacity over time (the 14500 less so, but still reasonable - there has been a slight increase in capacity over time). Note that AW 14500 cells also have higher capacity than his RCR (despite the comparable 750mAh rating).
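A screening step like the one described could be sketched as follows (the 10% tolerance and the runtime figures are purely illustrative - the text does not state an actual cutoff):

```python
# Hypothetical sketch of screening new cells against earlier samples:
# flag any cell whose runtime in a standard light falls outside a
# tolerance band around the running reference value.

def flag_outliers(runtimes_min, reference_min, tolerance=0.10):
    """Return the runtimes deviating more than `tolerance` (fraction)
    from the reference runtime; those cells would be discarded."""
    return [t for t in runtimes_min
            if abs(t - reference_min) / reference_min > tolerance]

# Four new 18650s run in a standard light, against a 120 min reference:
print(flag_outliers([118, 121, 104, 119], 120))  # [104] -> discard that cell
```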

The same process applies to my NiMH Eneloops. I buy multiple packs of 4xAAA and 4xAA every time they go on sale here, and go through them even faster than Li-ions (probably more like 20-25 cycles on average, maybe less). The reason is that occasional over-discharges occur (i.e., running a light down to off, or nearly so). This is damaging to LSD NiMH cells - I toss a cell once one of those events occurs.

You can learn more about various battery technologies at Battery University, a free online resource.

The discussions on the battery and electronics subforum of candlepowerforums can also be a very useful resource. For example, here is an instructive post by the cpf user Battery Guy, on IMR chemistry (LiMn2O4), and how it relates to other Li-ion chemistries such as ICR (LiCoO2) and IFR (LiFePO4).

I have done some reviews of a few Li-ion brands, but I am really not set up for proper power/current testing. The CPF reviewer HKJ has done extensive work on this front - please check out his Li-ion battery reviews on CPF.

A related issue with rechargeable batteries is how to best go about charging them. Again, I strongly recommend you check out the battery charging section of Battery University. There you will find a detailed set of pages on the various methods for different types of chemistries.

I have done a number of reviews of individual chargers, and you may find some helpful explanations there - especially, in the detailed discussions that follow in the review threads (i.e., with the input of the more experienced members of CPF).
Those reviews are in chronological order, with the most recent at the top. And again, I recommend you check out the other specific charger reviews at CPF (e.g., HKJ's are excellent).

Finally, I have done some comparison reviews of different types and makes of primary batteries. For more information, please see the following review threads on CPF:

Unfortunately, my flashlights are expensive to feed with all the runtime tests I perform. I don't accept any payment for any of my flashlight reviews, but I will gratefully accept donations to my Paypal battery fund. Your contributions will go toward helping defray the costs of creating all my detailed reviews.

Thanks for your contribution!