Predictions of environmental rules (here referred to as "scene grammar") can come in different forms: seeing a toilet in a living room would violate semantic predictions, while finding a toilet brush next to the toothpaste would violate syntactic predictions. The existence of such predictions has usually been investigated by showing observers images containing such grammatical violations. Conversely, the generative process of creating an environment according to one's scene grammar and its effects on behavior and memory has received little attention. In a virtual reality paradigm, we either instructed participants to arrange objects according to their scene grammar or against it. Subsequently, participants' memory for the arrangements was probed using a surprise recall (Exp1), or repeated search (Exp2) task. As a result, participants' construction behavior showed strategic use of larger, static objects to anchor the location of smaller objects which are generally the goals of everyday actions. Further analysis of this scene construction data revealed possible commonalities between the rules governing word usage in language and object usage in naturalistic environments. Taken together, we revealed some of the building blocks of scene grammar necessary for efficient behavior, which differentially influence how we interact with objects and what we remember about scenes.