Modern mobile phones are powerful processing devices with a host of onboard technologies of interest to navigation system designers. In the absence of Global Navigation Satellite System (GNSS) information, the accelerometers and gyroscopes within a smartphone can provide a relative navigation solution. However, these micro-electro-mechanical systems (MEMS) based sensors suffer from various errors that cause the inertial-only solution to deteriorate rapidly. As such, the inertial positioning solution must be constrained when long-term navigation is needed. GNSS positions and velocities, and Wi-Fi positions when available, are the most important updates for the inertial solution. However, both of these sources depend on external signals and infrastructure that may not always be available. One attractive alternative source of updates is a vision sensor. This work describes the development of a vision-based module that determines the device heading misalignment and usage context from a sequence of images captured by the device camera. The vision aiding module detects static periods and calculates the device heading misalignment when the user is in motion. Context classification is assessed for five common use cases: (1) fidgeting with the phone while standing still ("fidgeting" context), (2) phone on ear on one floor ("single floor calling" context), (3) phone on ear on stairs ("stairs calling" context), (4) phone in hand on a single floor ("single floor texting" context), and (5) phone in hand on stairs ("stairs texting" context). The module was tested using real-time video and inertial data collected with a Samsung Galaxy S3 smartphone running the Android 4.0 operating system. The results show successful detection of the aforementioned use cases and accurate estimation of the device misalignment angles. Integrating the vision aiding module with a pedestrian dead reckoning (PDR) system improves the position solution.