Windows – Part 3

Windows doesn’t provide a way for your app to detect the specific device your app is running on. It can tell you the device family (mobile, desktop, etc) the app is running on, the effective resolution, and the amount of screen space available to the app (the size of the app’s window).

If you place navigation elements at the bottom of the screen, they’ll be easier for phone users to access—but most PC users expect to see navigation elements toward the top of the screen.

The “7” in the “Windows 7” product name does not refer to the internal version number; it is a generational index. In fact, to minimize application compatibility issues, the version number for Windows 7 is actually 6.1. This allows applications that check the major version number to continue behaving on Windows 7 as they did on Windows Vista. Windows 7 and Server 2008 R2 have identical version/build numbers because they were built from the same Windows code base.
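
To make the compatibility argument concrete, here is a minimal sketch of the kind of check an older application might perform. The function names and the (major, minor) tuples are hypothetical illustrations, not a Windows API; only the 6.0 and 6.1 values come from the text above.

```python
# Hypothetical version checks; versions are modeled as (major, minor) tuples.
# 6.0 is Windows Vista and 6.1 is Windows 7, as described in the text.
VISTA = (6, 0)
WINDOWS_7 = (6, 1)


def major_only_check(version):
    """Fragile check: passes only when the major number is exactly 6."""
    major, _minor = version
    return major == 6


def at_least(version, minimum):
    """Sturdier check: tuple comparison accepts any newer version."""
    return version >= minimum


print(major_only_check(VISTA))      # True
print(major_only_check(WINDOWS_7))  # True, because the major number is still 6
print(major_only_check((7, 0)))     # False: what a "7.0" would have caused
print(at_least(WINDOWS_7, VISTA))   # True
```

An application written like `major_only_check` keeps working on 6.1, which is the behavior the version numbering was chosen to preserve.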

None of the .NET Framework runs in kernel mode.

A CPU with 32-bit registers, for example, can form at most 2^32 distinct addresses and is thus limited to addressing 4 GB of RAM. This may have seemed like an enormous amount of RAM when register sizes were being hashed out 40 years ago, but it’s a rather inconvenient limit for modern computers.
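
The arithmetic behind that ceiling can be checked in a few lines. This is only a worked calculation; the helper name `addressable_gib` is made up for the example, and it assumes one byte per address.

```python
GIB = 2 ** 30  # bytes in one gigabyte (binary)


def addressable_gib(address_bits):
    """Return how many GB can be addressed with the given address width."""
    return (2 ** address_bits) // GIB


print(2 ** 32)              # 4294967296 distinct byte addresses
print(addressable_gib(32))  # 4
print(addressable_gib(64))  # 17179869184, the "over 17 billion GB" figure
```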

You can’t store a 64-bit address in a 32-bit pointer. So if your code involves passing pointers back and forth to the operating system, as device drivers typically do, it’s not going to end well. On 64-bit Windows, pushes and pops on the stack are always in 8-byte strides, and pointers are 8 bytes wide.

I can see this perhaps being a reason a 32-bit driver might fail on 64-bit Windows: it issues a pop() that pops twice as much data as the driver expected.

Sixty-four bits of address space is over 17 billion GB, but current 64-bit hardware limits this to smaller values. And Windows implementation limits in the current versions of 64-bit Windows further reduce this to 8192 GB (8 TB).

You can’t talk much about Windows internals without referring to the registry because it’s the system database that contains the information required to boot and configure the system, system-wide software settings that control the operation of Windows, the security database, and per-user configuration settings (such as which screen saver to use).

Performance monitor and resource monitor

Despite its pervasive use of objects to represent shared system resources, Windows is not an object-oriented system in the strict sense. Most of the operating system code is written in C for portability. The C programming language doesn’t directly support object-oriented constructs such as dynamic binding of data types, polymorphic functions, or class inheritance. Therefore, the C-based implementation of objects in Windows borrows from, but doesn’t depend on, features of particular object-oriented languages.

The vast majority of Windows is written in C, with some portions in C++. Assembly language is used only for those parts of the operating system that need to communicate directly with system hardware (such as the interrupt trap handler) or that are extremely performance-sensitive (such as context switching).

Windows is a symmetric multiprocessing (SMP) operating system. There is no master processor; the operating system as well as user threads can be scheduled to run on any processor.

The checked build is provided primarily to aid device driver developers because it performs more stringent error checking on kernel-mode functions called by device drivers or other system code.


Ntdll.dll is a special system support library primarily for the use of subsystem DLLs. It contains two types of functions:

■ System service dispatch stubs to Windows executive system services

■ Internal support functions used by subsystems, subsystem DLLs, and other native images

Context switching. Although at a high level the same algorithm is used for thread selection and context switching (the context of the previous thread is saved, the context of the new thread is loaded, and the new thread is started), there are architectural differences among the implementations on different processors. Because the context is described by the processor state (registers and so on), what is saved and loaded varies depending on the architecture.

Device drivers in Windows don’t manipulate hardware directly, but rather they call functions in the HAL to interface with the hardware. Drivers are typically written in C (sometimes C++) and therefore, with proper use of HAL routines, can be source-code portable across the CPU architectures supported by Windows and binary portable within an architecture family.


Even the floppy driver has a system thread to poll the floppy device. (Polling is more efficient in this case because an interrupt-driven floppy driver consumes a large amount of system resources.)

For example, listing the services in a Svchost.exe process running under the System account looks like the following:

The default SAS on Windows is the combination Ctrl+Alt+Delete.

The reason for the SAS is to protect users from password-capture programs that simulate the logon process, because this keyboard sequence cannot be intercepted by a user-mode application.

The kernel distinguishes between interrupts and exceptions in the following way. An interrupt is an asynchronous event (one that can occur at any time) that is unrelated to what the processor is executing. Interrupts are generated primarily by I/O devices, processor clocks, or timers, and they can be enabled (turned on) or disabled (turned off).

An exception, in contrast, is a synchronous condition that usually results from the execution of a particular instruction. (Aborts, such as machine checks, are a type of processor exception that’s typically not associated with instruction execution.) Running a program a second time with the same data under the same conditions can reproduce exceptions. Examples of exceptions include memory-access violations, certain debugger instructions, and divide-by-zero errors. The kernel also regards system service calls as exceptions (although technically they’re system traps).

For example, in a multiprocessor system, each processor receives the clock interrupt, but only one processor updates the system clock in response to this interrupt. All the processors, however, use the interrupt to measure thread quantum and to initiate rescheduling when a thread’s quantum ends.

X86 interrupt request levels (IRQLs)

When a high-priority interrupt occurs, the processor saves the interrupted thread’s state and invokes the trap dispatchers associated with the interrupt.

The kernel uses software interrupts to initiate thread scheduling and to asynchronously break into a thread’s execution.

On a multiprocessor system, the kernel allocates and initializes an interrupt object for each CPU, enabling the local APIC on that CPU to accept the particular interrupt.

The secret behind that is that the default stack size is 1 MB and the user-mode address space assigned to a process under 32-bit Windows is about 2 GB. That allows around 2,000 threads per process (2,000 × 1 MB ≈ 2 GB).
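
The estimate is simple division, sketched below. The variable names are illustrative, and the result is an upper bound: it assumes the whole 2 GB of user address space is available for stacks, whereas in practice code, heaps, and DLLs occupy part of it.

```python
MIB = 2 ** 20
GIB = 2 ** 30

default_stack_reservation = 1 * MIB  # default stack size from the text
user_address_space = 2 * GIB         # 32-bit user-mode space from the text

# Upper bound only: assumes nothing but thread stacks lives in the process.
max_threads = user_address_space // default_stack_reservation
print(max_threads)  # 2048
```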

Although hardware generates most interrupts, the Windows kernel also generates software interrupts for a variety of tasks, including these:

■ Initiating thread dispatching

■ Non-time-critical interrupt processing

■ Handling timer expiration

■ Asynchronously executing a procedure in the context of a particular thread

■ Supporting asynchronous I/O operations

A Deferred Procedure Call (DPC) is a Microsoft Windows operating system mechanism which allows high-priority tasks (e.g. an interrupt handler) to defer required but lower-priority tasks for later execution. This permits device drivers and other low-level event consumers to perform the high-priority part of their processing quickly, and schedule non-critical additional processing for execution at a lower priority.
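
The split between urgent and deferred work can be illustrated with a toy model. This is a plain Python analogy, not the Windows DPC API: `ToyDpcQueue`, `interrupt_handler`, and `finish_io` are hypothetical names, and the explicit `drain()` call stands in for the kernel running queued work once the high-priority handlers have returned.

```python
from collections import deque


class ToyDpcQueue:
    """Toy stand-in for a per-processor queue of deferred routines."""

    def __init__(self):
        self._pending = deque()

    def queue(self, routine, *args):
        # Called from the "interrupt handler": record the work, do not run it.
        self._pending.append((routine, args))

    def drain(self):
        # Called later, at lower priority: run deferred work in FIFO order.
        while self._pending:
            routine, args = self._pending.popleft()
            routine(*args)


log = []
dpc_queue = ToyDpcQueue()


def finish_io(device):
    log.append(f"dpc: complete I/O for {device}")


def interrupt_handler(device):
    log.append(f"isr: acknowledge {device}")  # the short, urgent part
    dpc_queue.queue(finish_io, device)        # the rest is deferred


interrupt_handler("disk")
interrupt_handler("nic")
dpc_queue.drain()
print(log)
```

Both acknowledgements are logged before either completion routine runs, which is the ordering the mechanism is meant to provide.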

Energy efficiency problems were found.

List all environment variables from command line?
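
At a Windows command prompt, running `set` with no arguments prints them. As a small cross-platform sketch, the same listing can be produced from Python; the helper name `environment_lines` is made up for this example.

```python
import os


def environment_lines():
    """Return NAME=value strings for every environment variable, sorted."""
    return [f"{name}={value}" for name, value in sorted(os.environ.items())]


for line in environment_lines():
    print(line)
```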

Naturally, non-interlocked list operations must not be mixed with interlocked operations.

Because waiting for a spinlock literally stalls a processor, spinlocks can be used only under the following strictly limited circumstances:

■ The protected resource must be accessed quickly and without complicated interactions with other code.

■ The critical section code can’t be paged out of memory, can’t make references to pageable data, can’t call external procedures (including system services), and can’t generate interrupts or exceptions.

Kernel Synchronization Mechanisms

The thread selected by the kernel acquires the mutex object, and all other threads continue waiting. A mutex object can also be abandoned: this occurs when the thread that currently owns it terminates. When a thread terminates, the kernel enumerates all mutexes owned by the thread and sets them to the abandoned state, which, in terms of signaling logic, is treated as a signaled state in that ownership of the mutex is transferred to a waiting thread.

The kernel uses a technique known as address ordering to achieve this: because each object has a distinct and static kernel-mode address, the objects can be ordered in monotonically increasing address order, guaranteeing that locks are always acquired and released in the same order by all callers. This means that the caller-supplied array of objects will be duplicated and sorted accordingly.
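
The same idea can be sketched in user-mode Python. This is an analogy under stated assumptions, not kernel code: `id()` stands in for the object’s kernel-mode address, `threading.Lock` stands in for the per-object lock, and the helper names are hypothetical.

```python
import threading


def acquire_in_address_order(locks):
    """Copy and sort the caller's locks, then acquire them in that order."""
    ordered = sorted(locks, key=id)  # duplicate + sort, as the text describes
    for lock in ordered:
        lock.acquire()
    return ordered


def release_all(ordered):
    for lock in reversed(ordered):
        lock.release()


a, b = threading.Lock(), threading.Lock()

# Two callers pass the same objects in opposite orders...
first = acquire_in_address_order([a, b])
release_all(first)
second = acquire_in_address_order([b, a])
release_all(second)

# ...yet both end up locking them in the same sequence.
print([id(lock) for lock in first] == [id(lock) for lock in second])  # True
```

Because every caller agrees on one global order, no two callers can each hold a lock the other is waiting for.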

Windows improves keyed-event performance by using a hash table instead of a linked list to hold the waiter threads. This optimization allows Windows to include three new lightweight user-mode synchronization primitives that all depend on the keyed event.

Releasing the critical section behaves similarly, with bit state changing from 1 to 0 with an interlocked operation.

Two processes cannot use the same critical section to coordinate their operations, nor can duplication or inheritance be used.

When more than one thread tries to acquire the resource exclusively, the resource uses the mutex because it permits only one owner.

When the resource is acquired in shared mode by more than one thread, the resource uses a semaphore because it allows multiple owner counts. This level of detail is typically hidden from the programmer, and these internal objects should never be used directly.

Slim Reader/Writer (SRW) Locks

An SRW lock is the size of a pointer. The advantage is that it is fast to update the lock state. The disadvantage is that very little state information can be stored, so SRW locks cannot be acquired recursively. In addition, a thread that owns an SRW lock in shared mode cannot upgrade its ownership of the lock to exclusive mode.

In the more complex scenario, when the status is FALSE, the thread has lost the race. The thread must undo all the work it did, such as deleting objects or freeing memory, and then call InitOnceBeginInitialize again. However, instead of requesting to start a race as it did initially, it uses the INIT_ONCE_CHECK_ONLY flag, knowing that it has lost, and requests the winner’s context instead (for example, the objects or memory that were created or allocated by the winner). This returns another status, which can be TRUE, meaning that the context is valid and should be used or returned to the caller, or FALSE, meaning that initialization failed and nobody has actually been able to perform the work (such as in the case of a low-memory condition, perhaps).

The mechanism for run-once initialization is similar to the mechanism for condition variables and SRW Locks. The init once structure is pointer-size, and inline assembly versions of the SRW acquisition/release code are used for the noncontended case, while keyed events are used when contention has occurred (which happens when the mechanism is used in synchronous mode) and the other threads must wait for initialization.
In the asynchronous case, the locks are used in shared mode, so multiple threads can perform initialization at the same time.
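
The win-or-lose outcome of the asynchronous race can be modeled with a small sketch. It is a Python analogy only: `ToyInitOnce` and its `complete` method are hypothetical, and it uses an ordinary lock rather than the pointer-size structure and keyed events described above.

```python
import threading


class ToyInitOnce:
    """Toy model of racing initializers: the first to complete wins."""

    def __init__(self):
        self._guard = threading.Lock()
        self._done = False
        self._context = None

    def complete(self, candidate):
        """Try to publish candidate; return (won, context to use)."""
        with self._guard:
            if not self._done:
                self._done = True
                self._context = candidate
                return True, candidate
            # Lost the race: the caller should discard its own candidate
            # and use the winner's context instead.
            return False, self._context


once = ToyInitOnce()
won_a, ctx_a = once.complete({"owner": "thread A"})
won_b, ctx_b = once.complete({"owner": "thread B"})

print(won_a, ctx_a)  # True {'owner': 'thread A'}
print(won_b, ctx_b)  # False {'owner': 'thread A'}
```

The second caller plays the role of the losing thread: its own work is thrown away and it continues with the context the winner published.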

Some device drivers and executive components create their own threads dedicated to processing work at passive level; however, most use system worker threads instead, which avoids the unnecessary scheduling and memory overhead associated with having additional threads in the system.

If the same component is installed and registered both as a 32-bit binary and a 64-bit binary, the last component registered will override the registration of the previous component because they both write to the same location in the registry.

To help solve this problem transparently without introducing any code changes to 32-bit components, the registry is split into two portions: Native and Wow64. By default, 32-bit components access the 32-bit view and 64-bit components access the 64-bit view. This provides a safe execution environment for 32-bit and 64-bit components and separates the 32-bit application state from the 64-bit one if it exists.

The view of the input and/or output structure is different between the 32-bit application and the 64-bit driver, because pointers are 4 bytes for 32-bit applications and 8 bytes for 64-bit applications.
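
A quick way to see the mismatch is to lay out the same request structure twice, once with a 4-byte pointer field and once with an 8-byte one. The structure and field names below are hypothetical, and fixed-width integers stand in for the pointers so the sketch does not depend on the machine it runs on.

```python
import ctypes


class Request32(ctypes.Structure):
    """Layout as a 32-bit application would see it (4-byte pointer)."""
    _fields_ = [("length", ctypes.c_uint32),
                ("buffer", ctypes.c_uint32)]  # stand-in for a 32-bit pointer


class Request64(ctypes.Structure):
    """Layout as a 64-bit driver would see it (8-byte pointer)."""
    _fields_ = [("length", ctypes.c_uint32),
                ("buffer", ctypes.c_uint64)]  # stand-in for a 64-bit pointer


print(ctypes.sizeof(Request32))  # 8
print(ctypes.sizeof(Request64))  # larger; exact size depends on alignment
print(Request32.buffer.offset, Request64.buffer.offset)
```

The two sides disagree on both the total size and the width of the `buffer` field, so a structure filled in by one cannot be read directly by the other without conversion.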

Wow64 doesn’t support running 16-bit applications. However, because many application installers are 16-bit programs, Wow64 has special case code to make references to certain well-known 16-bit installers work.


One of the main goals behind the design of the Windows hypervisor was to have it as small and modular as possible, much like a microkernel, instead of providing a full, monolithic module. This means that most of the virtualization work is actually done by a separate virtualization stack and that there are also no hypervisor drivers.
The three types of hardware that the hypervisor must manage.

