In search and rescue scenarios, survivors must be found and their locations mapped quickly and efficiently. This paper presents a multimodal exploration and mapping approach that extends the occupancy grid map formulation to incorporate conditionally dependent observations from multiple sensors and enables reasoning about uncertainty to select maximally informative actions. Temperature measurements from a simulated thermal camera and range measurements from a simulated time-of-flight camera update dense spatial and thermal voxel maps. The information gain is computed as the sum of the mutual information between the depth sensor and the spatial map and the conditional mutual information between the multimodal sensor and the map. Formulating multimodal exploration and mapping in this way selects actions that drive the robot to collect thermal observations of occupied regions and that reduce the uncertainty of both the occupancy state and the temperature state of the environment. The performance of the proposed methodology is evaluated in simulations of an aerial robot exploring an office room and is compared to state-of-the-art information-theoretic exploration techniques.
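To make the information-gain objective concrete, the following is a minimal single-voxel sketch rather than the paper's implementation. Everything in it is an assumption made for illustration: the binary occupancy and temperature states, the symmetric sensor model with a hypothetical `hit_rate`, the choice to condition the thermal term on occupancy, and the function names. It only shows how a mutual information term and a conditional mutual information term can be summed, and why such a sum favors thermal observations of voxels that are likely occupied.

```python
import numpy as np


def entropy(p):
    """Shannon entropy in bits of a (possibly multi-dimensional) distribution."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]  # 0 * log(0) is treated as 0
    return float(-(p * np.log2(p)).sum())


def mutual_information(joint_xy):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for a joint table indexed [x, y]."""
    j = np.asarray(joint_xy, dtype=float)
    return entropy(j.sum(axis=1)) + entropy(j.sum(axis=0)) - entropy(j)


def conditional_mutual_information(joint_xyz):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z) for a table indexed [x, y, z]."""
    j = np.asarray(joint_xyz, dtype=float)
    return (entropy(j.sum(axis=1)) + entropy(j.sum(axis=0))
            - entropy(j) - entropy(j.sum(axis=(0, 1))))


def toy_information_gain(p_occupied, hit_rate=0.9):
    """Return (MI, CMI) for one voxel under the assumed toy sensor models."""
    p_o = np.array([1.0 - p_occupied, p_occupied])  # occupancy prior: free, occupied
    # Assumed symmetric sensor: reports the true binary state with prob. hit_rate.
    sensor = np.array([[hit_rate, 1.0 - hit_rate],
                       [1.0 - hit_rate, hit_rate]])

    # Joint of occupancy O and depth measurement Zd, indexed [o, zd].
    joint_o_zd = p_o[:, None] * sensor

    # Joint of temperature T, thermal measurement Zt, and occupancy O,
    # indexed [t, zt, o]. Assumption: a thermal reading of a free voxel
    # carries no information about T; for an occupied voxel it follows `sensor`.
    p_t = np.array([0.5, 0.5])  # temperature prior: cold, hot
    uninformative = np.full((2, 2), 0.5)
    joint_t_zt_o = np.zeros((2, 2, 2))
    joint_t_zt_o[:, :, 0] = p_o[0] * p_t[:, None] * uninformative
    joint_t_zt_o[:, :, 1] = p_o[1] * p_t[:, None] * sensor

    mi = mutual_information(joint_o_zd)
    cmi = conditional_mutual_information(joint_t_zt_o)
    return mi, cmi


if __name__ == "__main__":
    for p in (0.1, 0.5, 0.9):
        mi, cmi = toy_information_gain(p)
        print(f"p(occupied)={p:.1f}  MI={mi:.3f}  CMI={cmi:.3f}  gain={mi + cmi:.3f}")
```

In this toy model the thermal term scales with the probability that the voxel is occupied, while the depth term peaks where occupancy is most uncertain, so the summed gain trades off resolving occupancy against observing the temperature of likely occupied space.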